elerdg/NER-BERT-Italian

Named Entity Recognition with BERT on Italian

This project fine-tunes pre-trained BERT models for named-entity recognition (NER) on Italian data from the Wikineural dataset.

Overview:

The dataset:

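Wikineural IT comprises 111k sentences from Wikipedia, tokenized and NER-tagged. The dataset is organized in three splits: train, validation, and test. The sentences are cased and contain punctuation. Tags follow the BIO scheme: a B- tag marks the first token of an entity, an I- tag marks a continuation token, and O marks tokens outside any entity. The entity categories are encoded as integer IDs as shown below: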

{'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
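The mapping above can be inverted to turn integer tag IDs back into readable entity spans. The following is a minimal sketch, not code from this repository: the helper name `extract_entities` and the example sentence are hypothetical, and a stray I- tag with no preceding B- tag of the same type is assumed to start a new entity.

```python
from typing import List, Tuple

# Label mapping copied from the dataset description above.
LABEL2ID = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4,
            'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
# Inverse mapping, used to decode integer tags back to label strings.
ID2LABEL = {tag_id: label for label, tag_id in LABEL2ID.items()}


def extract_entities(tokens: List[str], tag_ids: List[int]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration only.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None

    for token, tag_id in zip(tokens, tag_ids):
        label = ID2LABEL[tag_id]
        if label == 'O':
            # Outside any entity: close the open span, if there is one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        elif label.startswith('I-') and label[2:] == current_type:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # A B- tag, or an I- tag that does not match the open entity
            # (assumption: treat it as the start of a new entity).
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], label[2:]

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical example sentence, not taken from the dataset.
tokens = ["Dante", "Alighieri", "nacque", "a", "Firenze", "."]
tag_ids = [1, 2, 0, 0, 5, 0]
example_entities = extract_entities(tokens, tag_ids)
print(example_entities)
```

The same `ID2LABEL` dictionary is the kind of mapping a token-classification model configuration typically needs, so that predictions are reported as label strings instead of raw integers.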

The pre-trained models in this project:
