\\
arXiv:2405.20435 (*cross-listing*)
Date: Thu, 30 May 2024 19:26:51 GMT (1563kb,D)

Title: Deep Learning for Computing Convergence Rates of Markov Chains
Authors: Yanlin Qu, Jose Blanchet, Peter Glynn
Categories: cs.LG math.PR stat.ML
\\
Convergence rate analysis for general state-space Markov chains is
fundamentally important in areas such as Markov chain Monte Carlo and
algorithmic analysis (for computing explicit convergence bounds). This problem,
however, is notoriously difficult because traditional analytical methods often
do not generate practically useful convergence bounds for realistic Markov
chains. We propose the Deep Contractive Drift Calculator (DCDC), the first
general-purpose sample-based algorithm for bounding the convergence of Markov
chains to stationarity in Wasserstein distance. The DCDC has two components.
First, inspired by the new convergence analysis framework in (Qu et al., 2023),
we introduce the Contractive Drift Equation (CDE), the solution of which leads
to an explicit convergence bound. Second, we develop an efficient
neural-network-based CDE solver. Equipped with these two components, DCDC
solves the CDE and converts the solution into a convergence bound. We analyze
the sample complexity of the algorithm and further demonstrate the
effectiveness of the DCDC by generating convergence bounds for realistic Markov
chains arising from stochastic processing networks as well as constant
step-size stochastic optimization.
\\ ( https://arxiv.org/abs/2405.20435 , 1563kb)
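A minimal sketch of the sample-based idea (not the authors' implementation;
the transition kernel, network shape, and penalty loss below are all
assumptions) is to train a nonnegative network V so that an empirical one-step
drift condition E[V(X_{t+1}) | X_t = x] <= V(x) - c holds on sampled states:

    import torch
    import torch.nn as nn

    def one_step(x):
        # Placeholder Markov transition (assumed): a contracting AR(1) chain.
        return 0.5 * x + 0.1 * torch.randn_like(x)

    # Nonnegative drift function V parameterized by a small network.
    V = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1), nn.Softplus())
    opt = torch.optim.Adam(V.parameters(), lr=1e-3)
    c = 0.05  # target drift margin

    for step in range(2000):
        x = 4.0 * torch.rand(256, 1) - 2.0  # sample states from a box
        # Monte Carlo estimate of E[V(X') | x] minus V(x), 8 transitions per state.
        drift = torch.stack([V(one_step(x)) for _ in range(8)]).mean(0) - V(x)
        loss = torch.relu(drift + c).mean()  # penalize violations of the drift condition
        opt.zero_grad()
        loss.backward()
        opt.step()

Once no violations remain on held-out samples, the pair (V, c) is the kind of
object the CDE framework converts into an explicit Wasserstein bound; that
conversion step is specific to the paper and omitted here.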
------------------------------------------------------------------------------
\\
arXiv:2405.20452 (*cross-listing*)
Date: Thu, 30 May 2024 19:58:01 GMT (927kb,D)

Title: Understanding Encoder-Decoder Structures in Machine Learning Using
  Information Measures
Authors: Jorge F. Silva and Victor Faraggi and Camilo Ramirez and Alvaro Egana
  and Eduardo Pavez
Categories: cs.LG cs.IT math.IT stat.ML
\\
We present new results to model and understand the role of encoder-decoder
design in machine learning (ML) from an information-theoretic angle. We use two
main information concepts, information sufficiency (IS) and mutual information
loss (MIL), to represent predictive structures in machine learning. Our first
main result provides a functional expression that characterizes the class of
probabilistic models consistent with an IS encoder-decoder latent predictive
structure. This result formally justifies the encoder-decoder forward stages
many modern ML architectures adopt to learn latent (compressed) representations
for classification. To illustrate IS as a realistic and relevant model
assumption, we revisit some known ML concepts and present some interesting new
examples: invariant, robust, sparse, and digital models. Furthermore, our IS
characterization allows us to tackle the fundamental question of how much
performance (predictive expressiveness) could be lost, using the cross-entropy
risk, when a given encoder-decoder architecture is adopted in a learning
setting. Here, our second main result shows that a mutual information loss
quantifies the lack of expressiveness attributed to the choice of a (biased)
encoder-decoder ML design. Finally, we address the problem of universal
cross-entropy learning with an encoder-decoder design, where necessary and
sufficient conditions are established to meet this requirement. In all these
results, Shannon's information measures offer new interpretations and
explanations for representation learning.
\\ ( https://arxiv.org/abs/2405.20452 , 927kb)
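For intuition on mutual information loss, here is a generic discrete
computation (the joint distribution and encoder below are made up, and this is
not the paper's construction): the predictive information a deterministic
encoder Z = f(X) discards about a label Y is I(X;Y) - I(Z;Y) >= 0.

    import numpy as np

    def mutual_info(pxy):
        # I(X;Y) in bits for a discrete joint distribution given as a 2-D array.
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        mask = pxy > 0
        return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

    # Made-up joint distribution of X in {0,1,2,3} and Y in {0,1}.
    pxy = np.array([[0.20, 0.05],
                    [0.05, 0.20],
                    [0.15, 0.10],
                    [0.10, 0.15]])

    f = np.array([0, 0, 1, 1])  # encoder merging {0,1} -> 0 and {2,3} -> 1
    pzy = np.zeros((2, 2))
    for x, z in enumerate(f):
        pzy[z] += pxy[x]

    mil = mutual_info(pxy) - mutual_info(pzy)  # information lost by the encoding
    print(f"I(X;Y) = {mutual_info(pxy):.4f} bits, MIL = {mil:.4f} bits")

In this toy case the encoder merges exactly the symbols that distinguish Y, so
I(Z;Y) collapses to zero and the MIL equals all of I(X;Y); an
information-sufficient encoder would instead give MIL = 0.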
------------------------------------------------------------------------------
\\
arXiv:2405.20550 (*cross-listing*)
Date: Fri, 31 May 2024 00:20:19 GMT (1009kb)

Title: Uncertainty Quantification for Deep Learning
Authors: Peter Jan van Leeuwen and J. Christine Chiu and C. Kevin Yang
Categories: cs.LG stat.ML
Comments: 25 pages, 4 figures, submitted to Environmental Data Science
MSC-class: 62D99
ACM-class: G.3
\\
A complete and statistically consistent uncertainty quantification for deep
learning is provided, including the sources of uncertainty arising from (1) the
new input data, (2) the training and testing data, (3) the weight vectors of
the neural network, and (4) the neural network itself, because it is not a
perfect predictor. Using Bayes' theorem and conditional probability densities,
we demonstrate how each uncertainty source can be systematically quantified. We
also introduce a fast and practical way to incorporate and combine all sources
of error for the first time. For illustration, the new method is applied to
quantify errors in cloud autoconversion rates, predicted from an artificial
neural network that was trained by aircraft cloud probe measurements in the
Azores and the stochastic collection equation formulated as a two-moment bin
model. For this specific example, the output uncertainty arising from
uncertainty in the training and testing data is dominant, followed by
uncertainty in the input data, in the trained neural network, and in the
weights. We discuss the usefulness of the methodology for machine learning
practice, and how, through the inclusion of uncertainty in the training data,
the new methodology is less sensitive to input data that falls outside the
training data set.
\\ ( https://arxiv.org/abs/2405.20550 , 1009kb)
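A toy sketch of combining such sources by Monte Carlo (an assumed
decomposition for illustration, not the authors' Bayesian scheme): propagate
input noise through the model, represent weight and training-data uncertainty
with a bootstrap ensemble, and add a residual variance for the model being an
imperfect predictor.

    import numpy as np

    rng = np.random.default_rng(0)
    xs = np.linspace(0, 1, 40)
    ys = np.sin(3 * xs) + 0.1 * rng.normal(size=40)  # made-up training data

    # "Weight / training-data uncertainty": ensemble fit to bootstrap resamples.
    models = []
    for _ in range(20):
        i = rng.integers(0, 40, 40)
        models.append(np.polyfit(xs[i], ys[i], deg=3))

    # "Imperfect predictor": residual variance of the mean model on the data.
    resid_var = np.var(ys - np.polyval(np.mean(models, 0), xs))

    def predict_with_uncertainty(x0, x_sigma=0.02, n=500):
        x_draws = x0 + x_sigma * rng.normal(size=n)           # input uncertainty
        w_draws = [models[k] for k in rng.integers(0, len(models), n)]
        preds = np.array([np.polyval(w, x) for w, x in zip(w_draws, x_draws)])
        return preds.mean(), preds.var() + resid_var          # total variance

    m, v = predict_with_uncertainty(0.5)
    print(f"prediction {m:.3f} +/- {np.sqrt(v):.3f}")

In this stand-in, the ensemble spread conflates sources (2) and (3), and
resid_var plays the role of source (4); the paper's conditional-density
treatment separates the four sources properly.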
------------------------------------------------------------------------------
\\
arXiv:2405.20857
Date: Fri, 31 May 2024 14:39:46 GMT (4332kb,D)

Title: Machine Learning Conservation Laws of Dynamical systems
Authors: Meskerem Abebaw Mebratie and R\"udiger Nather and Guido Falk von
  Rudorff and Werner M. Seiler
Categories: physics.comp-ph
\\
Conservation laws are of great theoretical and practical interest. We
describe a novel approach to machine learning conservation laws of
finite-dimensional dynamical systems using trajectory data. It is the first
such approach based on kernel methods instead of neural networks, which leads
to lower computational costs and requires less training data. We propose the
use of an "indeterminate" form of kernel ridge regression in which the labels
still have to be found by additional conditions. Here we use a simple approach
that minimises the length of the coefficient vector to discover a single
conservation law.
\\ ( https://arxiv.org/abs/2405.20857 , 4332kb)
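A small sketch in the spirit of that description (details assumed: a Gaussian
kernel, a toy harmonic oscillator, and discrete-time differences in place of
derivatives). Represent a candidate conserved quantity H(x) = sum_j a_j k(x_j, x)
over trajectory samples; requiring H to be constant along each trajectory makes
the unknown per-trajectory labels drop out of pairwise differences, and the
shortest unit coefficient vector satisfying the constraints is a minimal
singular vector.

    import numpy as np

    def trajectory(x0, steps=200, dt=0.05):
        # Toy system (assumed): harmonic oscillator, symplectic Euler steps.
        q, p = x0
        out = []
        for _ in range(steps):
            out.append((q, p))
            q = q + dt * p
            p = p - dt * q
        return np.array(out)

    X = np.vstack([trajectory((1.0, 0.0)), trajectory((0.3, 0.8))])  # (400, 2)
    K = np.exp(-2.0 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))  # Gaussian kernel

    D = K[1:] - K[:-1]             # rows encode H(x_{t+1}) - H(x_t)
    D = np.delete(D, 199, axis=0)  # drop the pair that straddles the two trajectories
    _, s, vt = np.linalg.svd(D, full_matrices=False)
    a = vt[-1]                     # shortest unit coefficient vector with D a ~ 0

    H = K @ a
    print("H spread within trajectories:", H[:200].std(), H[200:].std())
    print("H gap across trajectories:", abs(H[:200].mean() - H[200:].mean()))

Note that near-constant functions also satisfy the difference constraints, so
in practice one would inspect several of the smallest singular vectors and
discard trivial solutions; the paper's indeterminate regression handles the
label freedom more systematically.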