Data-Centric AI Intellectual Property Protection

This repository introduces research topics on protecting the intellectual property (IP) of AI from a data-centric perspective, including data-centric model IP protection, data authorization protection, data copyright protection, and any other data-level technology that protects the IP of AI. More content is coming; in the end, we care about your uniqueness!

Data-Centric Model Protection

Verify your ownership of a model through specific data, and restrict the model's use to authorized data or domains. A small sketch of the idea follows.
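To make the idea concrete, below is a minimal PyTorch-style sketch of a non-transferable learning (NTL) style training step: the model is rewarded for fitting the authorized domain and penalized for performing well on an unauthorized one. The loader/batch names, the clamped-loss penalty, and all hyperparameters are illustrative assumptions, not the exact objective of any paper listed here.

```python
# Minimal sketch of a non-transferable learning (NTL) style objective.
# Assumes PyTorch and two hypothetical batches drawn from an authorized
# and an unauthorized domain; this illustrates the general idea only.
import torch
import torch.nn.functional as F

def ntl_step(model, optimizer, authorized_batch, unauthorized_batch, alpha=0.1):
    """One training step: learn the authorized domain, unlearn the rest."""
    (x_auth, y_auth), (x_unauth, y_unauth) = authorized_batch, unauthorized_batch
    optimizer.zero_grad()
    # Standard supervised loss on the authorized domain.
    loss_auth = F.cross_entropy(model(x_auth), y_auth)
    # Penalize good performance on the unauthorized domain by subtracting
    # its (clamped) loss, i.e. pushing that loss up during training.
    loss_unauth = F.cross_entropy(model(x_unauth), y_unauth)
    loss = loss_auth - alpha * torch.clamp(loss_unauth, max=5.0)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Clamping the unauthorized loss keeps the "unlearning" term from dominating; published methods use more careful formulations (e.g., domain-distance regularizers), so treat this as a starting point.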

Image Data

  • Non-Transferable Learning: A New Approach for Model Ownership Verification and Applicability Authorization
  • Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection
  • Domain Specified Optimization for Deployment Authorization

Other Data

  • Unsupervised Non-transferable Text Classification

Data Authorization Protection (also known as unlearnable data or examples)

Prevent unauthorized use of data for model training, usually by degrading the performance of models trained on the protected data through poisoning-style perturbations. A sketch of the core mechanism follows.
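As a concrete illustration, here is a minimal sketch of error-minimizing ("unlearnable") noise in the spirit of the first entry below, assuming PyTorch and a fixed surrogate classifier. The full methods alternate this perturbation step with surrogate-model training; this sketch shows only the inner PGD-style loop, with illustrative names and hyperparameters.

```python
# Minimal sketch of error-minimizing perturbations: the noise *minimizes*
# the training loss (opposite sign to adversarial examples), so a model
# trained on x + delta finds little left to learn from the images.
import torch
import torch.nn.functional as F

def error_minimizing_noise(model, x, y, epsilon=8/255, steps=20, step_size=2/255):
    """PGD-style loop against a fixed surrogate `model` (call model.eval()
    first); returns unlearnable versions of the inputs x."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            # Descend the loss: make the perturbed images *easier* to fit.
            delta -= step_size * delta.grad.sign()
            # Keep the perturbation imperceptible (L-infinity budget).
            delta.clamp_(-epsilon, epsilon)
        delta.grad.zero_()
    return (x + delta).detach()
```

A model trained on the returned images tends toward near-random test accuracy on clean data, which is exactly the deterrent these methods aim for.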

Image Data

  • Unlearnable Examples: Making Personal Data Unexploitable
  • Going Grayscale: The Road to Understanding and Improving Unlearnable Examples
  • Robust Unlearnable Examples: Protecting Data Against Adversarial Learning
  • Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors
  • Transferable Unlearnable Examples
  • LAVA: Data Valuation without Pre-Specified Learning Algorithms
  • Unlearnable Clusters: Towards Label-Agnostic Unlearnable Examples
  • Universal Unlearnable Examples: Cluster-wise Perturbations without Label-consistency
  • Unlearnable Examples Give a False Sense of Security: Piercing through Unexploitable Data with Learnable Examples
  • Towards Generalizable Data Protection With Transferable Unlearnable Examples
  • CUDA: Convolution-Based Unlearnable Datasets
  • Raising the Cost of Malicious AI-Powered Image Editing
  • Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks
  • The Devil's Advocate: Shattering the Illusion of Unexploitable Data using Diffusion Models
  • GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models
  • Flew Over Learning Trap: Learn Unlearnable Samples by Progressive Staged Training
  • Segue: Side-information Guided Generative Unlearnable Examples for Facial Privacy Protection in Real World
  • What Can We Learn from Unlearnable Datasets?

Other Data

  • Unlearnable Examples: Protecting Open-Source Software from Unauthorized Neural Code Learning
  • WaveFuzz: A Clean-Label Poisoning Attack to Protect Your Voice
  • Unlearnable Graph: Protecting Graphs from Unauthorized Exploitation
  • Securing Biomedical Images from Unauthorized Training with Anti-Learning Perturbation
  • UPTON: Unattributable Authorship Text via Data Poisoning
  • GraphCloak: Safeguarding Task-specific Knowledge within Graph-structured Data from Unauthorized Exploitation
  • Make Text Unlearnable: Exploiting Effective Patterns to Protect Personal Data

Data Copyright Protection

Verify your ownership of data through black-box access to a model suspected of having trained on it, as sketched below.
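A common instantiation is backdoor-based dataset watermarking: the owner stamps a small trigger into a fraction of the released data, then checks in a black-box way whether a suspect model reacts to the trigger far more often than chance. The sketch below (NumPy, NHWC images, hypothetical `suspect_predict` API) illustrates only the verification side; real schemes use a proper hypothesis test rather than a fixed threshold.

```python
# Minimal sketch of black-box ownership verification via a backdoor
# watermark. `suspect_predict` is a hypothetical API: images -> labels.
import numpy as np

def add_trigger(images, patch_value=1.0, size=3):
    """Stamp a small bright square in the bottom-right corner (the trigger).
    Assumes images are NHWC float arrays in [0, 1]."""
    marked = images.copy()
    marked[:, -size:, -size:, :] = patch_value
    return marked

def verify_ownership(suspect_predict, probe_images, target_label,
                     num_classes=10, threshold=None):
    """If triggered probes hit `target_label` far more often than chance,
    the suspect model likely trained on the watermarked dataset."""
    preds = suspect_predict(add_trigger(probe_images))
    hit_rate = float(np.mean(preds == target_label))
    chance = 1.0 / num_classes
    threshold = threshold if threshold is not None else 3 * chance
    return hit_rate, hit_rate > threshold
```

Clean-label variants (several are listed below) embed the watermark without changing any labels, which makes the released dataset harder to distinguish from benign data.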

Image Data

  • Radioactive data: tracing through training
  • Tracing Data through Learning with Watermarking ([paper], Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security)
  • On the Effectiveness of Dataset Watermarking ([paper], Proceedings of the 2022 ACM on International Workshop on Security and Privacy Analytics)
  • Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection
  • Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking
  • On the Effectiveness of Dataset Watermarking in Adversarial Settings ([paper], Proceedings of CODASPY-IWSPA 2022)
  • Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks
  • Data Isotopes for Data Provenance in DNNs
  • Watermarking for Data Provenance in Object Detection ([paper], 2022 IEEE Applied Imagery Pattern Recognition Workshop (AIPR))
  • Reclaiming the Digital Commons: A Public Data Trust for Training Data
  • MedLocker: A Transferable Adversarial Watermarking for Preventing Unauthorized Analysis of Medical Image Dataset
  • How to Detect Unauthorized Data Usages in Text-to-image Diffusion Models
  • FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models
  • Domain Watermark: Effective and Harmless Dataset Copyright Protection is Closed at Hand
  • DiffusionShield: A Watermark for Copyright Protection against Generative Diffusion Models

Other Data

  • CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning
  • CodeMark: Imperceptible Watermarking for Code Datasets against Neural Code Completion Models
  • Watermarking Classification Dataset for Copyright Protection
