secureNLP

An attempt at SemEval 2018's shared task 8 (SecureNLP). This project addresses subtasks one and two of that shared task:

1. Classification of sentences as relevant or not relevant to the task of extracting information about malware capabilities.
2. Structure prediction over sentences (in BIO format) for Entities, Actions, and Modifiers that carry information about malware capabilities.
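
To illustrate the BIO format mentioned above, here is a minimal sketch of how labeled token spans map to B-/I-/O tags. The helper name `to_bio`, the example sentence, and its annotations are made up for illustration; they are not taken from the shared task data or from this project's code.

```python
def to_bio(tokens, spans):
    """Convert labeled token spans to BIO tags.

    spans is a list of (start, end, label) tuples over token indices,
    with end exclusive. Spans are assumed not to overlap.
    """
    tags = ["O"] * len(tokens)  # every token starts outside any span
    for start, end, label in spans:
        tags[start] = "B-" + label  # first token of the span
        for i in range(start + 1, end):
            tags[i] = "I-" + label  # remaining tokens inside the span
    return tags


# Hypothetical sentence and annotations, for illustration only.
tokens = ["The", "trojan", "deletes", "system", "files", "silently"]
spans = [(0, 2, "Entity"), (2, 3, "Action"), (3, 5, "Entity")]

for token, tag in zip(tokens, to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Tokens outside every span, such as the last one here, keep the O tag.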

For more information, see the official page. The data set can also be accessed through that website, although you have to contact the administrators to obtain it.

Results can be seen here. As of 9/5/2018, compared to scores from the evaluation period, this project has the highest relaxed score for Subtask 2, 3rd place in Subtask 1, and 4th place for the Subtask 2 strict score.

To run this project:

1. Obtain the data set.
2. Make sure all file locations for the data are accurate in config.py.
3. Run data_process.py.
4. Run the first task with sent_classification_sgd.py.
5. Run the second task with entity_recognition_crf.py.
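
The run order above can be sketched as a small driver script. This is only a sketch under assumptions: it assumes the three scripts take no command-line arguments and are run from the repository root, and the names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical helpers, not part of this project.

```python
import subprocess
import sys

# Script names come from the steps above, in the order they are listed.
PIPELINE = [
    "data_process.py",
    "sent_classification_sgd.py",
    "entity_recognition_crf.py",
]


def build_commands(python=sys.executable):
    """Return one command per script, in run order."""
    return [[python, script] for script in PIPELINE]


def run_pipeline(dry_run=True):
    """Print each command; run it too unless dry_run is set.

    Assumes config.py already points at the data set. check=True stops
    the pipeline at the first script that exits with an error.
    """
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` only prints the commands; `run_pipeline(dry_run=False)` would actually run them, once the data set is in place and config.py is correct.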

