
Dataset release for our paper, Threat Behavior Textual Search by Attention Graph Isomorphism


cwbae10-purdue/CTI-EACL24


Cyber Threat Intelligence Dataset

This repository hosts the dataset used for model training and evaluation in our CTI research:

Threat Behavior Textual Search by Attention Graph Isomorphism (Bae et al., EACL 2024)

The repository consists of a pretraining dataset, threat reports grouped by APT group, and a collector tool (used for all of this collection, and needed to add reports published after our work).

Large-scale Pretraining, Threat Reports Corpus Dataset

  • Textual corpus of threat reports

  • Collected from 8 vendors

Threat Reports, Classified by APT Groups

  • A collection of threat reports by APT groups

  • Our evaluation set is a well-filtered, manually verified set.

  • We also provide the lists copied from two public websites (Malpedia, ThaiCERT).

Threat Report Collector

  • Our dataset is current as of June 2022. We will release our collector as a tool (in progress; to be uploaded soon).

MISC

  • Copyright of all data belongs to the original authors or their vendors.

  • Any misuse of attack information is strictly prohibited.

  • Please contact us (Chanwoo Bae, bae68@purdue.edu) for any questions.

  • We kindly request that you cite our paper if you use the dataset.
