diff --git a/doc/Articles/arxivlistings.do.txt b/doc/Articles/arxivlistings.do.txt
new file mode 100644
index 0000000..00dbf36
--- /dev/null
+++ b/doc/Articles/arxivlistings.do.txt
@@ -0,0 +1,113 @@
+\\
+arXiv:2405.20435 (*cross-listing*)
+Date: Thu, 30 May 2024 19:26:51 GMT (1563kb,D)
+
+Title: Deep Learning for Computing Convergence Rates of Markov Chains
+Authors: Yanlin Qu, Jose Blanchet, Peter Glynn
+Categories: cs.LG math.PR stat.ML
+\\
+  Convergence rate analysis for general state-space Markov chains is
+fundamentally important in areas such as Markov chain Monte Carlo and
+algorithmic analysis (for computing explicit convergence bounds). This
+problem, however, is notoriously difficult because traditional analytical
+methods often do not generate practically useful convergence bounds for
+realistic Markov chains. We propose the Deep Contractive Drift Calculator
+(DCDC), the first general-purpose sample-based algorithm for bounding the
+convergence of Markov chains to stationarity in Wasserstein distance. The
+DCDC has two components. First, inspired by the new convergence analysis
+framework in (Qu et al., 2023), we introduce the Contractive Drift Equation
+(CDE), the solution of which leads to an explicit convergence bound. Second,
+we develop an efficient neural-network-based CDE solver. Equipped with these
+two components, DCDC solves the CDE and converts the solution into a
+convergence bound. We analyze the sample complexity of the algorithm and
+further demonstrate the effectiveness of the DCDC by generating convergence
+bounds for realistic Markov chains arising from stochastic processing
+networks as well as constant step-size stochastic optimization.
+\\ ( https://arxiv.org/abs/2405.20435 , 1563kb)
+------------------------------------------------------------------------------
+\\
+arXiv:2405.20452 (*cross-listing*)
+Date: Thu, 30 May 2024 19:58:01 GMT (927kb,D)
+
+Title: Understanding Encoder-Decoder Structures in Machine Learning Using
+  Information Measures
+Authors: Jorge F. Silva and Victor Faraggi and Camilo Ramirez and Alvaro Egana
+  and Eduardo Pavez
+Categories: cs.LG cs.IT math.IT stat.ML
+\\
+  We present new results to model and understand the role of encoder-decoder
+design in machine learning (ML) from an information-theoretic angle. We use
+two main information concepts, information sufficiency (IS) and mutual
+information loss (MIL), to represent predictive structures in machine
+learning. Our first main result provides a functional expression that
+characterizes the class of probabilistic models consistent with an IS
+encoder-decoder latent predictive structure. This result formally justifies
+the encoder-decoder forward stages many modern ML architectures adopt to
+learn latent (compressed) representations for classification. To illustrate
+IS as a realistic and relevant model assumption, we revisit some known ML
+concepts and present some interesting new examples: invariant, robust,
+sparse, and digital models. Furthermore, our IS characterization allows us
+to tackle the fundamental question of how much performance (predictive
+expressiveness) could be lost, using the cross-entropy risk, when a given
+encoder-decoder architecture is adopted in a learning setting. Here, our
+second main result shows that a mutual information loss quantifies the lack
+of expressiveness attributed to the choice of a (biased) encoder-decoder ML
+design. Finally, we address the problem of universal cross-entropy learning
+with an encoder-decoder design, for which necessary and sufficient
+conditions are established. In all these results, Shannon's information
+measures offer new interpretations and explanations for representation
+learning.
+\\ ( https://arxiv.org/abs/2405.20452 , 927kb)
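+
+As a toy illustration of the mutual information loss (MIL) used in the
+listing above, the sketch below computes I(X;Y) - I(Z;Y) for a
+deterministic encoder Z = f(X) on a small discrete joint distribution. The
+distribution, the encoder, and all names are invented for illustration;
+the paper treats far more general encoder-decoder structures.
+!bc pycod
+import numpy as np
+
+def mutual_information(pxy):
+    """I(X;Y) in nats for a joint pmf stored as a 2-D array."""
+    px = pxy.sum(axis=1, keepdims=True)   # marginal of X, shape (nx, 1)
+    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, ny)
+    mask = pxy > 0
+    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px * py)[mask])))
+
+# A made-up joint pmf over X in {0,1,2,3} and Y in {0,1}.
+pxy = np.array([[0.20, 0.05],
+                [0.15, 0.10],
+                [0.05, 0.20],
+                [0.10, 0.15]])
+
+# A deterministic encoder Z = f(X) that merges symbols {0,1} and {2,3}.
+encoder = [0, 0, 1, 1]
+pzy = np.zeros((2, 2))
+for x, z in enumerate(encoder):
+    pzy[z] += pxy[x]
+
+# Data processing guarantees I(Z;Y) <= I(X;Y); the gap is the MIL.
+mil = mutual_information(pxy) - mutual_information(pzy)
+print(f"MIL = {mil:.4f} nats")
+!ec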
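+
+For the Markov-chain listing further above (arXiv:2405.20435), the DCDC
+itself is too involved for a digest note, but the quantity it bounds,
+contraction to stationarity in Wasserstein distance, can be seen in a
+minimal synchronous coupling of a toy AR(1) chain. The chain and all
+constants below are invented for illustration and are not from the paper.
+!bc pycod
+import numpy as np
+
+# Toy AR(1) chain X_{t+1} = a*X_t + Z_t with i.i.d. standard normal Z_t.
+a, n_steps, n_pairs = 0.8, 5, 10_000
+rng = np.random.default_rng(0)
+
+# Synchronous coupling: two copies started apart but driven by the SAME
+# noise, so their gap contracts by exactly |a| per step for this chain.
+x = np.full(n_pairs, -1.0)
+y = np.full(n_pairs, 3.0)
+gap0 = np.abs(x - y).mean()
+for _ in range(n_steps):
+    z = rng.standard_normal(n_pairs)   # shared innovations
+    x, y = a * x + z, a * y + z
+
+# The coupled-gap ratio upper-bounds the Wasserstein-1 contraction rate.
+rate = (np.abs(x - y).mean() / gap0) ** (1.0 / n_steps)
+print(f"per-step contraction factor ~ {rate:.3f} (exact value: {a})")
+!ec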
+------------------------------------------------------------------------------
+\\
+arXiv:2405.20550 (*cross-listing*)
+Date: Fri, 31 May 2024 00:20:19 GMT (1009kb)
+
+Title: Uncertainty Quantification for Deep Learning
+Authors: Peter Jan van Leeuwen and J. Christine Chiu and C. Kevin Yang
+Categories: cs.LG stat.ML
+Comments: 25 pages, 4 figures, submitted to Environmental Data Science
+MSC-class: 62D99
+ACM-class: G.3
+\\
+  A complete and statistically consistent uncertainty quantification for
+deep learning is provided, including the sources of uncertainty arising from
+(1) the new input data, (2) the training and testing data, (3) the weight
+vectors of the neural network, and (4) the neural network because it is not
+a perfect predictor. Using Bayes' theorem and conditional probability
+densities, we demonstrate how each uncertainty source can be systematically
+quantified. We also introduce, for the first time, a fast and practical way
+to incorporate and combine all sources of error. For illustration, the new
+method is applied to quantify errors in cloud autoconversion rates,
+predicted from an artificial neural network that was trained by aircraft
+cloud probe measurements in the Azores and the stochastic collection
+equation formulated as a two-moment bin model. For this specific example,
+the output uncertainty arising from uncertainty in the training and testing
+data is dominant, followed by uncertainty in the input data, in the trained
+neural network, and uncertainty in the weights. We discuss the usefulness of
+the methodology for machine learning practice, and how, through inclusion of
+uncertainty in the training data, the new methodology is less sensitive to
+input data that falls outside the training data set.
+\\ ( https://arxiv.org/abs/2405.20550 , 1009kb)
+------------------------------------------------------------------------------
+\\
+arXiv:2405.20857
+Date: Fri, 31 May 2024 14:39:46 GMT (4332kb,D)
+
+Title: Machine Learning Conservation Laws of Dynamical Systems
+Authors: Meskerem Abebaw Mebratie and R\"udiger Nather and Guido Falk von
+  Rudorff and Werner M. Seiler
+Categories: physics.comp-ph
+\\
+  Conservation laws are of great theoretical and practical interest. We
+describe a novel approach to machine learning conservation laws of
+finite-dimensional dynamical systems using trajectory data. It is the first
+such approach based on kernel methods instead of neural networks, which
+leads to lower computational costs and requires less training data. We
+propose the use of an "indeterminate" form of kernel ridge regression,
+where the labels still have to be found by additional conditions. Here we
+use a simple approach that minimises the length of the coefficient vector
+to discover a single conservation law.
+\\ ( https://arxiv.org/abs/2405.20857 , 4332kb)
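+
+A minimal sketch in the spirit of the listing directly above: recover a
+conserved quantity of the harmonic oscillator from an RBF-kernel ansatz by
+picking the minimal-residual coefficient vector of the invariance
+constraints. The kernel, the sample sizes, and the SVD shortcut are
+assumptions made for illustration; the paper's indeterminate kernel ridge
+regression differs in its details.
+!bc pycod
+import numpy as np
+
+rng = np.random.default_rng(0)
+centers = rng.uniform(-2, 2, size=(40, 2))    # kernel centres c_i
+samples = rng.uniform(-2, 2, size=(200, 2))   # collocation points s_j
+gamma = 0.5
+
+def k(c, s):
+    """RBF kernel k(c, s) = exp(-gamma * ||s - c||^2)."""
+    return np.exp(-gamma * np.sum((s - c) ** 2))
+
+def grad_k(c, s):
+    """Gradient of k(c, s) with respect to s."""
+    return -2.0 * gamma * (s - c) * k(c, s)
+
+def f(s):
+    """Harmonic-oscillator vector field: (x, v) -> (v, -x)."""
+    return np.array([s[1], -s[0]])
+
+# Ansatz H(s) = sum_i alpha_i k(c_i, s). A conserved H satisfies
+# grad H(s) . f(s) = 0, so build A[j, i] = grad_k(c_i, s_j) . f(s_j) and
+# take the right singular vector of the smallest singular value: the unit
+# coefficient vector with the smallest constraint residual.
+A = np.array([[grad_k(c, s) @ f(s) for c in centers] for s in samples])
+alpha = np.linalg.svd(A)[2][-1]
+
+def H(s):
+    return sum(a * k(c, s) for a, c in zip(alpha, centers))
+
+# H should be nearly constant along the trajectory (cos t, sin t).
+for t in np.linspace(0.0, 2 * np.pi, 5):
+    s = np.array([np.cos(t), np.sin(t)])
+    print(f"H({s[0]:+.2f}, {s[1]:+.2f}) = {H(s):+.5f}")
+!ec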
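+
+For the uncertainty-quantification listing further above (arXiv:2405.20550),
+a two-term split via the law of total variance gives a feel for how weight
+uncertainty and residual noise combine in an ensemble; the paper's Bayesian
+treatment covers four sources and is far more complete. Everything below is
+a made-up stand-in, not the authors' method.
+!bc pycod
+import numpy as np
+
+rng = np.random.default_rng(1)
+
+# Stand-in "ensemble": 20 trained predictors, mimicked by linear models
+# whose slopes scatter around 2.0 (weight uncertainty) and which each
+# carry a residual noise variance (imperfect-predictor uncertainty).
+n_models, x = 20, 0.7
+slopes = 2.0 + 0.1 * rng.standard_normal(n_models)
+noise_var = np.full(n_models, 0.05)
+
+means = slopes * x              # each member's point prediction at x
+aleatoric = noise_var.mean()    # E[Var(y | model)]: residual noise
+epistemic = means.var()         # Var(E[y | model]): weight uncertainty
+total = aleatoric + epistemic   # law of total variance
+
+print(f"total variance {total:.4f} = "
+      f"noise {aleatoric:.4f} + weights {epistemic:.4f}")
+!ec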