\\
arXiv:2405.20435 (*cross-listing*)
Date: Thu, 30 May 2024 19:26:51 GMT (1563kb,D)

Title: Deep Learning for Computing Convergence Rates of Markov Chains
Authors: Yanlin Qu, Jose Blanchet, Peter Glynn
Categories: cs.LG math.PR stat.ML
\\
Convergence rate analysis for general state-space Markov chains is
fundamentally important in areas such as Markov chain Monte Carlo and
algorithmic analysis (for computing explicit convergence bounds). This problem,
however, is notoriously difficult because traditional analytical methods often
do not generate practically useful convergence bounds for realistic Markov
chains. We propose the Deep Contractive Drift Calculator (DCDC), the first
general-purpose sample-based algorithm for bounding the convergence of Markov
chains to stationarity in Wasserstein distance. The DCDC has two components.
First, inspired by the new convergence analysis framework in (Qu et al., 2023),
we introduce the Contractive Drift Equation (CDE), the solution of which leads
to an explicit convergence bound. Second, we develop an efficient
neural-network-based CDE solver. Equipped with these two components, DCDC
solves the CDE and converts the solution into a convergence bound. We analyze
the sample complexity of the algorithm and further demonstrate the
effectiveness of the DCDC by generating convergence bounds for realistic Markov
chains arising from stochastic processing networks as well as constant
step-size stochastic optimization.
\\ ( https://arxiv.org/abs/2405.20435 , 1563kb)
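A minimal sketch of the sample-based idea (not the authors' implementation;
the transition kernel, network shape, and penalty loss below are all
assumptions) is to train a nonnegative network V so that an empirical one-step
drift condition E[V(X_{t+1}) | X_t = x] <= V(x) - c holds on sampled states:

    import torch
    import torch.nn as nn

    def one_step(x):
        # Placeholder Markov transition (assumed): a contracting AR(1) chain.
        return 0.5 * x + 0.1 * torch.randn_like(x)

    # Nonnegative drift function V parameterized by a small network.
    V = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1), nn.Softplus())
    opt = torch.optim.Adam(V.parameters(), lr=1e-3)
    c = 0.05  # target drift margin

    for step in range(2000):
        x = 4.0 * torch.rand(256, 1) - 2.0  # sample states from a box
        # Monte Carlo estimate of E[V(X') | x] minus V(x), 8 transitions per state.
        drift = torch.stack([V(one_step(x)) for _ in range(8)]).mean(0) - V(x)
        loss = torch.relu(drift + c).mean()  # penalize violations of the drift condition
        opt.zero_grad()
        loss.backward()
        opt.step()

Once no violations remain on held-out samples, the pair (V, c) is the kind of
object the CDE framework converts into an explicit Wasserstein bound; that
conversion step is specific to the paper and omitted here.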
------------------------------------------------------------------------------
\\
arXiv:2405.20452 (*cross-listing*)
Date: Thu, 30 May 2024 19:58:01 GMT (927kb,D)

Title: Understanding Encoder-Decoder Structures in Machine Learning Using
  Information Measures
Authors: Jorge F. Silva and Victor Faraggi and Camilo Ramirez and Alvaro Egana
  and Eduardo Pavez
Categories: cs.LG cs.IT math.IT stat.ML
\\
We present new results to model and understand the role of encoder-decoder
design in machine learning (ML) from an information-theoretic angle. We use two
main information concepts, information sufficiency (IS) and mutual information
loss (MIL), to represent predictive structures in machine learning. Our first
main result provides a functional expression that characterizes the class of
probabilistic models consistent with an IS encoder-decoder latent predictive
structure. This result formally justifies the encoder-decoder forward stages
many modern ML architectures adopt to learn latent (compressed) representations
for classification. To illustrate IS as a realistic and relevant model
assumption, we revisit some known ML concepts and present some interesting new
examples: invariant, robust, sparse, and digital models. Furthermore, our IS
characterization allows us to tackle the fundamental question of how much
performance (predictive expressiveness) could be lost, using the cross-entropy
risk, when a given encoder-decoder architecture is adopted in a learning
setting. Here, our second main result shows that a mutual information loss
quantifies the lack of expressiveness attributed to the choice of a (biased)
encoder-decoder ML design. Finally, we address the problem of universal
cross-entropy learning with an encoder-decoder design, where necessary and
sufficient conditions are established to meet this requirement. In all these
results, Shannon's information measures offer new interpretations and
explanations for representation learning.
\\ ( https://arxiv.org/abs/2405.20452 , 927kb)
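For intuition on mutual information loss, here is a generic discrete
computation (the joint distribution and encoder below are made up, and this is
not the paper's construction): the predictive information a deterministic
encoder Z = f(X) discards about a label Y is I(X;Y) - I(Z;Y) >= 0.

    import numpy as np

    def mutual_info(pxy):
        # I(X;Y) in bits for a discrete joint distribution given as a 2-D array.
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        mask = pxy > 0
        return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

    # Made-up joint distribution of X in {0,1,2,3} and Y in {0,1}.
    pxy = np.array([[0.20, 0.05],
                    [0.05, 0.20],
                    [0.15, 0.10],
                    [0.10, 0.15]])

    f = np.array([0, 0, 1, 1])  # encoder merging {0,1} -> 0 and {2,3} -> 1
    pzy = np.zeros((2, 2))
    for x, z in enumerate(f):
        pzy[z] += pxy[x]

    mil = mutual_info(pxy) - mutual_info(pzy)  # information lost by the encoding
    print(f"I(X;Y) = {mutual_info(pxy):.4f} bits, MIL = {mil:.4f} bits")

In this toy case the encoder merges exactly the symbols that distinguish Y, so
I(Z;Y) collapses to zero and the MIL equals all of I(X;Y); an
information-sufficient encoder would instead give MIL = 0.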
------------------------------------------------------------------------------
\\
arXiv:2405.20550 (*cross-listing*)
Date: Fri, 31 May 2024 00:20:19 GMT (1009kb)

Title: Uncertainty Quantification for Deep Learning
Authors: Peter Jan van Leeuwen and J. Christine Chiu and C. Kevin Yang
Categories: cs.LG stat.ML
Comments: 25 pages, 4 figures, submitted to Environmental Data Science
MSC-class: 62D99
ACM-class: G.3
\\
A complete and statistically consistent uncertainty quantification for deep
learning is provided, including the sources of uncertainty arising from (1) the
new input data, (2) the training and testing data, (3) the weight vectors of
the neural network, and (4) the neural network itself, because it is not a
perfect predictor. Using Bayes' theorem and conditional probability densities,
we demonstrate how each uncertainty source can be systematically quantified. We
also introduce a fast and practical way to incorporate and combine all sources
of error for the first time. For illustration, the new method is applied to
quantify errors in cloud autoconversion rates, predicted from an artificial
neural network that was trained by aircraft cloud probe measurements in the
Azores and the stochastic collection equation formulated as a two-moment bin
model. For this specific example, the output uncertainty arising from
uncertainty in the training and testing data is dominant, followed by
uncertainty in the input data, in the trained neural network, and in the
weights. We discuss the usefulness of the methodology for machine learning
practice, and how, through the inclusion of uncertainty in the training data,
the new methodology is less sensitive to input data that falls outside the
training data set.
\\ ( https://arxiv.org/abs/2405.20550 , 1009kb)
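A toy sketch of combining such sources by Monte Carlo (an assumed
decomposition for illustration, not the authors' Bayesian scheme): propagate
input noise through the model, represent weight and training-data uncertainty
with a bootstrap ensemble, and add a residual variance for the model being an
imperfect predictor.

    import numpy as np

    rng = np.random.default_rng(0)
    xs = np.linspace(0, 1, 40)
    ys = np.sin(3 * xs) + 0.1 * rng.normal(size=40)  # made-up training data

    # "Weight / training-data uncertainty": ensemble fit to bootstrap resamples.
    models = []
    for _ in range(20):
        i = rng.integers(0, 40, 40)
        models.append(np.polyfit(xs[i], ys[i], deg=3))

    # "Imperfect predictor": residual variance of the mean model on the data.
    resid_var = np.var(ys - np.polyval(np.mean(models, 0), xs))

    def predict_with_uncertainty(x0, x_sigma=0.02, n=500):
        x_draws = x0 + x_sigma * rng.normal(size=n)           # input uncertainty
        w_draws = [models[k] for k in rng.integers(0, len(models), n)]
        preds = np.array([np.polyval(w, x) for w, x in zip(w_draws, x_draws)])
        return preds.mean(), preds.var() + resid_var          # total variance

    m, v = predict_with_uncertainty(0.5)
    print(f"prediction {m:.3f} +/- {np.sqrt(v):.3f}")

In this stand-in, the ensemble spread conflates sources (2) and (3), and
resid_var plays the role of source (4); the paper's conditional-density
treatment separates the four sources properly.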
------------------------------------------------------------------------------
\\
arXiv:2405.20857
Date: Fri, 31 May 2024 14:39:46 GMT (4332kb,D)

Title: Machine Learning Conservation Laws of Dynamical systems
Authors: Meskerem Abebaw Mebratie and R\"udiger Nather and Guido Falk von
  Rudorff and Werner M. Seiler
Categories: physics.comp-ph
\\
Conservation laws are of great theoretical and practical interest. We
describe a novel approach to machine learning conservation laws of
finite-dimensional dynamical systems using trajectory data. It is the first
such approach based on kernel methods instead of neural networks, which leads
to lower computational costs and requires less training data. We propose the
use of an "indeterminate" form of kernel ridge regression in which the labels
still have to be found by additional conditions. Here we use a simple approach
that minimises the length of the coefficient vector to discover a single
conservation law.
\\ ( https://arxiv.org/abs/2405.20857 , 4332kb)
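A small sketch in the spirit of that description (details assumed: a Gaussian
kernel, a toy harmonic oscillator, and discrete-time differences in place of
derivatives). Represent a candidate conserved quantity H(x) = sum_j a_j k(x_j, x)
over trajectory samples; requiring H to be constant along each trajectory makes
the unknown per-trajectory labels drop out of pairwise differences, and the
shortest unit coefficient vector satisfying the constraints is a minimal
singular vector.

    import numpy as np

    def trajectory(x0, steps=200, dt=0.05):
        # Toy system (assumed): harmonic oscillator, symplectic Euler steps.
        q, p = x0
        out = []
        for _ in range(steps):
            out.append((q, p))
            q = q + dt * p
            p = p - dt * q
        return np.array(out)

    X = np.vstack([trajectory((1.0, 0.0)), trajectory((0.3, 0.8))])  # (400, 2)
    K = np.exp(-2.0 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))  # Gaussian kernel

    D = K[1:] - K[:-1]             # rows encode H(x_{t+1}) - H(x_t)
    D = np.delete(D, 199, axis=0)  # drop the pair that straddles the two trajectories
    _, s, vt = np.linalg.svd(D, full_matrices=False)
    a = vt[-1]                     # shortest unit coefficient vector with D a ~ 0

    H = K @ a
    print("H spread within trajectories:", H[:200].std(), H[200:].std())
    print("H gap across trajectories:", abs(H[:200].mean() - H[200:].mean()))

Note that near-constant functions also satisfy the difference constraints, so
in practice one would inspect several of the smallest singular vectors and
discard trivial solutions; the paper's indeterminate regression handles the
label freedom more systematically.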