[FEATURE REQUEST] Archive of Classes #287

whitead · 2023-01-01T16:14:47Z

Description

I'm not sure if this makes sense for the ~~QDL~~ QD algorithm, but I would like to generate diversity for multi-label classification. That is, each sample x has a binary vector of labels (length ~15-50). I would like to generate a diversity of these binary vectors. I can probably just use the CVTArchive with one-hot vectors, but I was curious if there is a better way, or if this will make sense with the ~~QDL~~ QD algorithm.

Thank you!

The text was updated successfully, but these errors were encountered:

btjanaka · 2023-01-01T17:19:57Z

Hi @whitead, thanks for using pyribs. I think it may help to first formalize your problem a bit so that I can better understand it. What are your objectives, measures / behaviors, and solution vectors? Right now, it seems like your binary vector of labels may either be the measures or the solution vectors.

I am also not familiar with the QDL algorithm; would you mind sharing the relevant paper?

whitead · 2023-01-01T17:42:44Z

Thanks for the quick response! Sorry QDL meant Quality Diversity. Not sure what the L was supposed to stand for in my head...

In my setting, the features/actions x are 2-3 points in a low-dimensional space. Kind of like the robot arm example. To be concrete, take x be 3 2D points, written as x = [-2, 3, 2.5, 1, 0, 3].

The objective is a value between 0 and 1.

After running my simulator given x, it returns both the objective and a set of integers representing properties. Examples of return values for the properties are (1,5,3) or () or (1,27,4) (order is irrelevant). These properties are like multi-label classification. I would like my behavior/archive to be these properties. So I could represent them as a binary vector - (1,5,3) => [0, 1, 0, 1, 0, 1, 0, ....] where the length is the number of possible properties (~15-100 depending on simulator parameters).

I was thinking then my archive could be a grid, where the two bins are 0/1, but that might not behave well with D = 100 I think. Another possibility is the CVTArchive - however I may need to replace the distance with a binary-vector distance measure (e.g., Jaccard Index).

My question for you is if this sounds reasonable and have you come across a case where the behaviors are binary vectors. And, if CVTArchive is reasonable, how I might modify the distance for comparing the vectors. Thanks!

btjanaka · 2023-01-01T18:19:29Z

Your approach certainly sounds reasonable. If you have a 100-dimensional archive, CVTArchive is a good choice. I'm not sure how useful our CMA-based algorithms will be since we have not really used them with binary behaviors, only discrete ones, but it is certainly worth a try. I'd suggest running with the bounds set to (0, 1) in each dimension. Maybe try out CMA-ME (I assume you're using v0.4.0, but we'll have CMA-MAE in v0.5.0) and MAP-Elites. I would not suggest modifying the CVTArchive's distance metric immediately as that would be very involved (we designed everything around Euclidean distance in that archive).

whitead · 2023-01-06T02:21:07Z

Thanks for your help @btjanaka - I'll reopen if I have additional questions.

whitead added the enhancement New feature or request label Jan 1, 2023

whitead closed this as completed Jan 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE REQUEST] Archive of Classes #287

[FEATURE REQUEST] Archive of Classes #287

whitead commented Jan 1, 2023 •

edited

btjanaka commented Jan 1, 2023

whitead commented Jan 1, 2023

btjanaka commented Jan 1, 2023

whitead commented Jan 6, 2023

[FEATURE REQUEST] Archive of Classes #287

[FEATURE REQUEST] Archive of Classes #287

Comments

whitead commented Jan 1, 2023 • edited

Description

btjanaka commented Jan 1, 2023

whitead commented Jan 1, 2023

btjanaka commented Jan 1, 2023

whitead commented Jan 6, 2023

whitead commented Jan 1, 2023 •

edited