
Add support for ConfTr #55

Closed
2 of 5 tasks
pat-alt opened this issue Feb 15, 2023 · 3 comments · Fixed by #60
Assignees: pat-alt
Labels: CCE 💯 · difficult (This is expected to be difficult.) · enhancement (New feature or request)

Comments


pat-alt commented Feb 15, 2023

This ICLR 2022 paper shows how to train conformal classifiers.

  • Add losses for the prediction step
  • Streamline (need separate score method for dealing with MLJFlux) - done in b4c7140
  • Add support for differentiable quantile computations (calibration step)
  • Implement batch training procedure
  • Test and document
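For reference, the core idea behind the tasks above - soft prediction sets thresholded at a smooth quantile, with a smooth size loss and a classification loss - can be sketched in a few lines. This is plain Python for illustration only (not the paper's or this package's implementation), and the function names, temperature, and threshold are made up:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_set(scores, tau, T=0.1):
    # Soft set membership C_k = sigma((E_k - tau) / T) for each class k,
    # where E_k is the conformity score and tau the (smooth) quantile.
    return [sigmoid((s - tau) / T) for s in scores]

def smooth_size_loss(C, kappa=1):
    # Penalise soft set sizes above a target size kappa.
    return max(0.0, sum(C) - kappa)

def classification_loss(C, y):
    # Penalise the true label y being (softly) excluded from the set.
    return 1.0 - C[y]

# Illustrative softmax scores for a 3-class example with true label 0
# and an arbitrary threshold tau = 0.5:
scores = [0.7, 0.2, 0.1]
C = soft_set(scores, tau=0.5)
print(smooth_size_loss(C), classification_loss(C, 0))
```

Because every step is built from sigmoids rather than hard indicators, both losses are differentiable in the scores and can be backpropagated through during training.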
@pat-alt added the enhancement (New feature or request) and difficult (This is expected to be difficult.) labels Feb 15, 2023

pat-alt commented Feb 15, 2023

Some questions that have come up so far:

  1. Is the Dirac delta really supposed to be an indicator function? Equation (5) on page 5. Maybe I'm just not familiar with this notation.
  2. Doesn't the smooth size loss depend a lot on the scale of the (non-)conformity scores? For $E_{\theta}(x,k)=\pi_{\theta,k}(x)\in[0,1]$, for example, we have that $\sigma(E_{\theta}(x,k) - \tau) \in [0.27,0.73]$. We can use temperature scaling, but can we really speak of 'probabilities' that labels are assigned to the set?
  3. More on smooth size loss: What about empty sets? Shouldn't they be penalised at least as heavily as complete sets?
    a. Could just penalise these cases as $K - \kappa$, that is, the maximum set size minus the target set size (1).
    b. Perhaps even better: penalise $\sum(1-C) - \kappa$, that is, the total sum of probabilities that labels are not assigned to $C$.
  4. As for the smooth quantile computation, it seems that Zygote.jl's AD actually lets me compute gradients as long as I sort the values beforehand (see this answer on SO). Is this surprising?
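To make the range in question 2 concrete, here is a quick numeric check (plain Python, for illustration): with $E_{\theta}(x,k)$ and $\tau$ both in $[0,1]$, the argument of the sigmoid is confined to $[-1,1]$, which pins the soft assignments to roughly $[0.27, 0.73]$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# E and tau both live in [0, 1], so E - tau ranges over [-1, 1].
lo = sigmoid(-1.0)  # most extreme "not in the set" case
hi = sigmoid(1.0)   # most extreme "in the set" case
print(round(lo, 2), round(hi, 2))  # 0.27 0.73
```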

@davidstutz would much appreciate your thoughts, if you get the chance. This is still early stages here, so there's absolutely no rush. Amazing paper by the way!!

@pat-alt pat-alt self-assigned this Feb 15, 2023
@davidstutz

Re:

  1. You can find a reference implementation for Equation (5) here.
  2. Reference implementation for that is here - but this does indeed depend on the scale. That's what the temperature term $T$ is for: $\sigma((E_\theta(x, k) - \tau)/T)$. Also, you can use the log-probabilities $E_\theta(x, k) = \log \pi_{\theta,k}(x)$ which works a bit better in practice.
  3. They can be penalized but this is generally not necessary. Basically, as long as there is one true label for each example, and $\alpha$ is reasonably low, the majority of prediction sets will contain at least the true label (so not be empty). This is mainly a result of the simple conformity score (for other conformal predictors this can be different). Beyond that, you are of course free to penalize that, but I am just saying that it is generally not required to learn good classifiers.
  4. Gradients with respect to what is the question. Generally, gradients are not a problem as long as the sorting is fixed; the key is getting gradients through the sorting itself - this is what the smooth sorter is for.
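To illustrate the temperature term in answer 2 (a plain-Python sketch, with an arbitrary example; not the reference implementation): dividing by $T$ before the sigmoid recovers a near-binary membership even when scores and threshold live in $[0,1]$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_assignment(E, tau, T):
    # sigma((E - tau) / T): smaller T -> sharper, closer to a hard indicator.
    return sigmoid((E - tau) / T)

# With T = 1 the assignment is squashed into [0.27, 0.73] ...
print(soft_assignment(1.0, 0.0, 1.0))   # ~0.73
# ... while a smaller temperature pushes it back towards {0, 1}.
print(soft_assignment(1.0, 0.0, 0.1))   # ~0.99995
```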

Hope that helps. If I am slow to respond on here, feel free to send me an email to follow up - I'm always curious to see what people do with conformal training, especially as I had some follow-up ideas but couldn't really pursue them.
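On point 4, one simple way to see how a quantile can be made smooth (this is an illustrative alternative in plain Python, not the paper's smooth sorter) is to define $\tau$ implicitly as the value at which the *soft* fraction of scores below it equals $q$; since that condition is built from sigmoids, $\tau$ depends smoothly on the scores and gradients can flow through it:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def smooth_quantile(xs, q, T=0.05, iters=60):
    # Find tau such that mean_i sigma((tau - x_i) / T) = q, by bisection.
    # Unlike a hard sort-and-index quantile, tau varies smoothly with every
    # x_i, so it admits gradients (via the implicit function theorem)
    # without differentiating through a piecewise-constant sort.
    lo, hi = min(xs) - 1.0, max(xs) + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        frac = sum(sigmoid((mid - x) / T) for x in xs) / len(xs)
        if frac < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xs = [0.1, 0.4, 0.35, 0.8, 0.9]
tau = smooth_quantile(xs, 0.6)  # lands between the 3rd and 4th order statistic
```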


pat-alt commented Feb 17, 2023

Wow, this was quick - thanks a lot 🙏

That all makes sense. Regarding the quantile computation, thanks for the clarification. For my current use case, I just need to differentiate with respect to a conformal model that has already been calibrated, but I see now why you need information about the sorting itself for training.

Thanks again for being responsive!
