feat(MachineLearning/PACLearning): add VersionSpace abstraction #2
Open
Zetetic-Dhruv wants to merge 1 commit into SamuelSchlesinger:feat/pac-learning-defs from
Adds the classical version space abstraction (Mitchell 1982, Angluin 1980) to the PAC learning module. The version space (the subset of a concept class consistent with observed labeled data) is the structural substrate every sample complexity theorem operates on (Baby PAC, Sauer-Shelah, PAC lower bounds, NFL).

This PR complements leanprover#492 by providing:

- `VersionSpace C S`: the consistent subset of `C` given sample `S`
- `versionSpace_subset`, `versionSpace_empty_sample`: sanity lemmas
- `versionSpace_antitone`: structural monotonicity (more data → smaller VS), dual to sample complexity monotonicity
- `IsConsistent A C`: predicate on learners (output always in version space)
- `IsConsistent.output_mem_conceptClass`, `output_consistent`: consistent learner properties
- `mem_versionSpace_of_realizable`: under realizable data, the target concept lies in the version space (the realizable-case bridge)
- `versionSpace_nonempty_of_realizable`: corollary

No measure theory, no new Mathlib dependencies, ~150 lines, 0 sorry.

Together with leanprover#492 these suffice to state the Baby PAC theorem, Sauer-Shelah sample complexity, PAC lower bounds, and NFL as structural statements rather than ad-hoc computations.
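
For readers without the diff open, the set-level definitions and lemmas listed above could be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: it assumes concepts are plain functions `X → Y`, a concept class is a `Set (X → Y)`, and a sample is a `List (X × Y)`. The actual signatures in the PR and in leanprover#492 may differ, and `IsConsistent` is left out because its exact form depends on the learner type defined there.

```lean
import Mathlib.Data.Set.Basic

universe u v

variable {X : Type u} {Y : Type v}

/-- Assumed encoding: the concepts in `C` that agree with every
labeled point `(x, y)` of the sample `S`. -/
def VersionSpace (C : Set (X → Y)) (S : List (X × Y)) : Set (X → Y) :=
  {h | h ∈ C ∧ ∀ p ∈ S, h p.1 = p.2}

/-- The version space never leaves the concept class. -/
theorem versionSpace_subset (C : Set (X → Y)) (S : List (X × Y)) :
    VersionSpace C S ⊆ C :=
  fun _ hh => hh.1

/-- With no data, every concept in `C` is still consistent. -/
theorem versionSpace_empty_sample (C : Set (X → Y)) :
    VersionSpace C [] = C := by
  ext h
  constructor
  · intro hh
    exact hh.1
  · intro hC
    -- No point belongs to the empty sample, so consistency is vacuous.
    exact ⟨hC, fun p hp => by simp at hp⟩

/-- More data gives a smaller version space. -/
theorem versionSpace_antitone (C : Set (X → Y)) {S T : List (X × Y)}
    (hST : S ⊆ T) : VersionSpace C T ⊆ VersionSpace C S :=
  fun _ hh => ⟨hh.1, fun p hp => hh.2 p (hST hp)⟩

/-- Realizable case: if the target `c ∈ C` labels the sample,
then `c` lies in the version space. -/
theorem mem_versionSpace_of_realizable {C : Set (X → Y)} {c : X → Y}
    (hc : c ∈ C) {S : List (X × Y)} (hS : ∀ p ∈ S, c p.1 = p.2) :
    c ∈ VersionSpace C S :=
  ⟨hc, hS⟩

/-- Corollary: under realizability the version space is nonempty. -/
theorem versionSpace_nonempty_of_realizable {C : Set (X → Y)} {c : X → Y}
    (hc : c ∈ C) {S : List (X × Y)} (hS : ∀ p ∈ S, c p.1 = p.2) :
    (VersionSpace C S).Nonempty :=
  ⟨c, mem_versionSpace_of_realizable hc hS⟩
```

In this encoding antitonicity is stated with respect to list inclusion `S ⊆ T`, which is one of several reasonable choices; a multiset- or prefix-based ordering on samples would give an analogous statement.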
Adds the classical version space abstraction (Mitchell 1982, Angluin 1980) as a companion to the PAC learning definitions in leanprover#492.
Contents:
- `VersionSpace C S`: the subset of `C` whose concepts agree with `S` on every sample point
- `versionSpace_subset`, `versionSpace_empty_sample`: sanity lemmas
- `versionSpace_antitone`: more data gives a smaller version space
- `IsConsistent A C`: predicate on learners whose output always lies in the version space
- `IsConsistent.output_mem_conceptClass`, `IsConsistent.output_consistent`: derived properties
- `mem_versionSpace_of_realizable`, `versionSpace_nonempty_of_realizable`: realizable-case bridge

This is required for the PAC lower bound, infinite NFL, and Sauer-Shelah proofs. I feel version space and complexity go hand in hand. If you feel this is not fully relevant, I will submit it separately.
But this is a short add-on and might help satisfy Shreyas. Let me know.