Assemblage

Assemblage is a distributed tool for discovering, generating, and archiving binary corpora. It provides high-quality labeled metadata for building training data for machine learning on binaries, and for other binary analysis work such as static and dynamic analysis and reverse engineering.

Our paper is now available on arXiv.

Deployment and Dataset Availability

A brief introduction to the APIs is provided at this link, and deployment instructions can be found here.

We include only the subset of binaries for which a permissive license can be ascertained; please check out our data sheet for details.
For up-to-date information and downloads, please visit the dataset page.

The code in this repository is published under the MIT license.

