how to inspect/compare extracted features at scale? #13

Closed
rbroc opened this issue Jun 29, 2023 · 6 comments

Comments

@rbroc
Owner

rbroc commented Jun 29, 2023

Currently, even with TextDescriptives only (i.e., no additional feature sets for cognitive features), we have dozens of features, and it is infeasible to look at their distributions individually, both for simple data exploration and for actual modeling purposes.

There are a few options to deal with exploding dimensionality here:

  • Do nothing, and simply feed raw features to tree-based models for AI vs. human text discrimination, looking at which features are most predictive post hoc;
  • Apply dimensionality reduction a priori, e.g., through PCA, then visualize/compare (as a function of prompts, or as a function of whether text is produced by humans or models). This reduces dimensionality, but potentially also reduces interpretability;
  • Compare features one by one with statistical tests, e.g., across prompt types or across human- vs. model-generated text.

On top of this, additional output-driven dimensionality reduction (e.g., LASSO) could be applied to select the features that are a) most affected by prompting; or b) most discriminative between human- and model-generated text (a rough sketch of this and of the first option follows below).
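For concreteness, a rough sketch of what the first option and the output-driven (LASSO-style) selection could look like with scikit-learn, assuming the extracted features sit in one dataframe with a human/model label (file and column names are placeholders, and an L1-penalised logistic regression stands in for LASSO given the binary outcome):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("features.csv")   # hypothetical: one row per text, TextDescriptives columns + "label"
X = df.drop(columns=["label"])
y = df["label"]                    # e.g., 0 = human, 1 = model

# Option 1: feed raw features to a tree ensemble, then rank features post hoc
forest = RandomForestClassifier(n_estimators=500, random_state=42).fit(X, y)
importances = pd.Series(forest.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).head(10))

# Output-driven selection (LASSO-style): an L1-penalised logistic regression
# zeroes out features that do not help discriminate human vs. model text
X_scaled = StandardScaler().fit_transform(X)
lasso_like = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X_scaled, y)
print("retained features:", list(X.columns[(lasso_like.coef_ != 0).ravel()]))
```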

@rbroc
Owner Author

rbroc commented Aug 31, 2023

quick status update on this. currently considering to:

  • select the best prompt (among, e.g., the top 3) based on a subsample of completions generated by 7b models, choosing the prompt whose TextDescriptives metrics have distributions closest to the human completions (rough sketch below);
  • validate the best prompt with human annotations, simply checking that the summaries / paraphrases / completions produced are "acceptable", and possibly as acceptable as human completions;
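a rough sketch of what the distribution comparison in the first point could look like, using per-metric Wasserstein distance as one possible distributional distance (file names, column names, and the averaging across metrics are placeholder assumptions, not decisions):

```python
import pandas as pd
from scipy.stats import wasserstein_distance

human = pd.read_csv("human_metrics.csv")   # hypothetical: TextDescriptives metrics for human completions
model = pd.read_csv("model_metrics.csv")   # hypothetical: same metrics for 7b completions + "prompt" column

metric_cols = [c for c in model.columns if c != "prompt"]

# standardise each metric by the human distribution so distances are comparable across metrics
mu, sd = human[metric_cols].mean(), human[metric_cols].std()

scores = {}
for prompt, group in model.groupby("prompt"):
    dists = [
        wasserstein_distance((human[m] - mu[m]) / sd[m], (group[m] - mu[m]) / sd[m])
        for m in metric_cols
    ]
    scores[prompt] = sum(dists) / len(dists)

best_prompt = min(scores, key=scores.get)
print("closest to human:", best_prompt, f"(mean distance {scores[best_prompt]:.3f})")
```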

@MinaAlmasi
Collaborator

MinaAlmasi commented Aug 31, 2023

In relation to this:

> select the best prompt (among, e.g., the top 3) based on a subsample of completions generated by 7b models, choosing the prompt whose TextDescriptives metrics have distributions closest to the human completions;

Sounds good! We still need to decide which features to focus on (or how to cluster them) and what criteria we want for the comparisons (are we doing any statistical testing for validation, as you initially suggested?).

@rbroc
Owner Author

rbroc commented Aug 31, 2023

one possible approach we discussed earlier would be to use multidimensional distance metrics for selection -- something along the lines of: a) doing PCA on the full feature set; b) computing Euclidean distances on the rotated features (maybe with some weighting to account for different components capturing different amounts of variance), and selecting the prompt that yields the lowest average distance from the human completions.
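a rough sketch of that pipeline, assuming human and model completions (tagged by prompt) live in a single dataframe and using the explained-variance ratio as the weighting (names and the 95% cut-off are purely illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("all_metrics.csv")        # hypothetical: one row per completion, "source" is "human" or a prompt id
features = df.drop(columns=["source"])

# a) PCA on the standardised full feature set
X = StandardScaler().fit_transform(features)
pca = PCA(n_components=0.95)               # keep components covering ~95% of the variance
scores = pca.fit_transform(X)

# b) Euclidean distance in rotated space, weighting each component by the variance it captures
weights = pca.explained_variance_ratio_
human_centroid = scores[(df["source"] == "human").to_numpy()].mean(axis=0)
df["dist_to_human"] = np.sqrt(((scores - human_centroid) ** 2 * weights).sum(axis=1))

# lowest average distance = candidate best prompt
print(df[df["source"] != "human"].groupby("source")["dist_to_human"].mean().sort_values())
```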

this could be a valid approach, though partly circular (or too adversarial) if we use the same features to build a classifier that discriminates between human- and AI-generated text, so we may still consider alternative approaches based on hand-picking, a tiny bit of human validation, or even ROUGE-like scores for selection?

if this is not operationally urgent, we can maybe defer this to a little meeting/sprint with the whole team involved

@MinaAlmasi
Collaborator

> if this is not operationally urgent, we can maybe defer this to a little meeting/sprint with the whole team involved

I don't think it is urgent currently, so that sounds like a plan!

@rbroc
Owner Author

rbroc commented Mar 15, 2024

an update for future selves, as it's been a long time since this issue was opened. we have decided to go for minimal engineering of prompts -- they should be reasonable instructions, generic enough to work as prompts for multiple models.
we used PCA-rotated length-based features from TextDescriptives to inspect data generated with multiple prompts, and chose the prompts that delivered the fewest outliers. there could have been more principled ways of doing this -- e.g., going for the prompts that were closest to the human completions in the full feature space -- but the logic would have been circular (using the same features for adversarial prompt selection and for classification).
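for the record, a sketch of roughly what that outlier-count heuristic looks like (the 3-SD cut-off, the number of components, and file/column names here are illustrative, not the exact settings we used):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("length_metrics.csv")   # hypothetical: length-based TextDescriptives features + "prompt" column
features = df.drop(columns=["prompt"])

# project onto the first few PCA components and flag completions far from the bulk
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(features))
z = StandardScaler().fit_transform(scores)          # standardise components so "3 SD" means the same on each
df["is_outlier"] = (np.abs(z) > 3).any(axis=1)

# fewer outliers = prompt produces fewer degenerate / atypical completions
print(df.groupby("prompt")["is_outlier"].sum().sort_values())
```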

@rbroc
Owner Author

rbroc commented Mar 15, 2024

also closing because no further action is needed

@rbroc rbroc closed this as completed Mar 15, 2024