Source Code and Data for Software Domain NER

superCui7/StackOverflowNER

 
 


ECE-GY-6143 Machine Learning Final Project

Code and Named Entity Recognition on StackOverflow

This work is based on the paper and the repository.

The modified version of utils_fine_tune is here.

Please see the details in the report.

Dataset and Model for Fine-grained Software Entity Extraction

This repository contains the code and data released with the paper "Code and Named Entity Recognition in StackOverflow" (ACL 2020). [Paper PDF]

For the source code of our NER tagger, check the code/NER/ folder.

For our annotated data with software-domain named entities, check the resources/annotated_ner_data/ folder.
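As a rough starting point for working with the annotated data, the sketch below reads token-per-line NER annotations into sentences. It assumes a CoNLL-style layout (one whitespace-separated token and tag per line, the token in the first column, the tag in the last, and a blank line between sentences); the actual file layout in resources/annotated_ner_data/ may differ, so check the files first. The function name `read_conll` and the sample tags are hypothetical and used only for illustration.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group (token, tag) pairs into sentences; a blank line ends a sentence."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: close the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        # Assumption: first column is the token, last column is the tag.
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample in the assumed format (not taken from the dataset).
sample = [
    "I O",
    "use O",
    "pandas B-Library",
    "",
    "Call O",
    "read_csv B-Function",
]
parsed = read_conll(sample)
```

To read a real file, pass an open file handle instead of the sample list, for example `read_conll(open(path, encoding="utf-8"))`, where `path` points at one of the annotation files.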

To cite the data or the code included in this repository, please use the following bibtex entry:

  @inproceedings{Tabassum20acl,
      title = {Code and Named Entity Recognition in StackOverflow},
      author = {Tabassum, Jeniya and Maddela, Mounica and Xu, Wei and Ritter, Alan},
      booktitle = {The Annual Meeting of the Association for Computational Linguistics (ACL)},
      year = {2020}
  }

