bamos/presentations

This repository contains the slides behind my major presentations with a CC-BY license.

[2024] Amortized optimization for OT and LLMs

Powerpoint | PDF

Amortized optimization methods provide fast solvers by predicting approximate solutions to optimization problems. This talk covers two recent advancements using amortization to significantly speed up the solvers of non-trivial optimization problems arising in the fields of optimal transport (OT) and large language model (LLM) attacks. Computational optimal transport problems may involve solving three nested optimization problems, each of which amortization can help with: 1) the solution map from the measures to the primal/dual OT solution (Meta OT), 2) the computation of the c-transform or Fenchel conjugate (amortized conjugates), and 3) the computation of geodesics and Lagrangian (minimum-action) paths/costs (Lagrangian OT). Adding amortization to the standard solvers in these OT settings significantly improves the runtime and deployment time of the methods. These faster amortized solutions to the Fenchel conjugate and geodesic/Lagrangian paths are potentially of more general interest in other settings bottlenecked by numerical solutions to them. Beyond these optimal transport applications, we will also discuss the prompt optimization problems arising in adversarial attacks on LLMs (AdvPrompter). Here, amortization enables us to attain state-of-the-art results on the standard AdvBench dataset that also transfer to closed-source black-box LLM APIs. The fast amortized predictions then enable us to generate a synthetic dataset of adversarial examples which an LLM can be fine-tuned on to make it more robust against jailbreaking attacks while maintaining performance.

[2024] Differentiable optimization and robotics

Powerpoint | PDF

Optimization is a crucial technology for robotics and provides functionality such as optimal control, motion planning, state estimation, alignment, manipulation, tactile sensing, pose tracking, and safety mechanisms. These solvers are often integrated with learned models that estimate and predict non-trivial parts of the world. Differentiable optimization enables the learned model to receive a learning signal from these downstream optimization problems. This signal encourages the model to improve on regions that are important for the optimization problem to work well, rather than making accurate predictions under a supervised loss. This talk will overview the foundations, applications, and recent advancements on these topics, with a focus on continuous optimal control (MPC) and non-linear least squares.
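As a minimal sketch of how a learning signal can pass through an optimization problem, the example below differentiates the solution of a small unconstrained problem with respect to its parameters using the implicit function theorem, and checks the result against finite differences. The objective, the matrix `Q`, and the function names are illustrative assumptions and are not taken from the slides.

```python
import numpy as np

# Hypothetical problem family (an illustrative assumption):
#   x*(theta) = argmin_x 0.5 x^T Q x + 0.25 * sum(x^4) - theta^T x
Q = np.array([[2.0, 0.5], [0.5, 1.0]])

def grad(x, theta):
    # Gradient of the objective with respect to x.
    return Q @ x + x**3 - theta

def hess(x):
    # Hessian of the objective with respect to x (positive definite here).
    return Q + np.diag(3.0 * x**2)

def solve(theta, iters=50):
    # Newton's method on the stationarity condition grad(x, theta) = 0.
    x = np.zeros_like(theta)
    for _ in range(iters):
        x = x - np.linalg.solve(hess(x), grad(x, theta))
    return x

def solution_jacobian(theta):
    # Implicit function theorem: dx*/dtheta = -(d grad/dx)^{-1} (d grad/dtheta).
    # Here d grad/dtheta = -I, so the Jacobian is the inverse Hessian at x*.
    x_star = solve(theta)
    return np.linalg.inv(hess(x_star))

theta = np.array([1.0, -0.5])
J = solution_jacobian(theta)

# Central finite-difference check of the implicit Jacobian.
eps = 1e-6
J_fd = np.column_stack([
    (solve(theta + eps * e) - solve(theta - eps * e)) / (2 * eps)
    for e in np.eye(2)
])
max_error = float(np.max(np.abs(J - J_fd)))
```

In a learning pipeline, a downstream loss gradient with respect to `x*` would be multiplied by this Jacobian to obtain the gradient with respect to `theta`, which is the signal the upstream model receives.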

[2024] Amortized optimization and AI

Powerpoint | PDF

AI and optimization systems are widely deployed in today's computing landscape. AI systems have a remarkable capacity to make abstractions and predictions about the world while optimization systems drive decision-making, control, and robotic systems that reason and interact with the world. These technologies are already intertwined and overlapping, and optimization-based reasoning systems will continue playing a crucial role in AI systems as they continue advancing towards general intelligence. Connecting to Kahneman's modes of thought, explicitly forming and solving an optimization problem is akin to "System 2" (i.e., slow thinking), while rapidly predicting a solution to the problem can be seen as "System 1" (i.e., fast thinking). AI systems can interact with optimization solvers via a "System 2" approach by using optimization as a tool, where humans can also inject domain knowledge or safety constraints and guardrails, or via "System 1" by learning to rapidly predict (or amortize) solutions to the optimization problems.

This talk focuses on the amortization process of distilling the solutions to optimization problems into a fast, predictive model. We highlight a few recent developments in:

  1. amortizing transportation between measures (Meta Optimal Transport and Meta Flow Matching). These methods have applications in computational biology for predicting how a population of cells will be transported given an initial population and treatment.
  2. amortizing convex conjugates and Lagrangian paths, including geodesic computations. These significantly improve neural optimal transport methods that repeatedly solve these subproblems, and are of broader interest in any setting that repeatedly computes conjugates or solves path planning problems.
  3. amortizing language model prompt optimization and adversarial attacks. This setting involves repeatedly searching over the prompt space for every new prompt to jailbreak a target model, and amortization involves learning a language model that generates prompt-conditional suffixes that solve this optimization problem. Amortizing these problems attains state-of-the-art results and human-interpretable prompt modifications on the standard AdvBench settings that also transfer to closed-source black-box LLM APIs.

[2024] Lagrangian OT Poster

Powerpoint | PDF

We investigate the optimal transport problem between probability measures when the underlying cost function is understood to satisfy a least action principle, also known as a Lagrangian cost. These generalizations are useful when connecting observations from a physical system where the transport dynamics are influenced by the geometry of the system, such as obstacles (e.g., incorporating barrier functions in the Lagrangian), and allow practitioners to incorporate a priori knowledge of the underlying system such as non-Euclidean geometries (e.g., paths must be circular). Our contributions are of computational interest, where we demonstrate the ability to efficiently compute geodesics and amortize spline-based paths, which has not been done before, even in low-dimensional problems. Unlike prior work, we also output the resulting Lagrangian optimal transport map without requiring an ODE solver. We demonstrate the effectiveness of our formulation on low-dimensional examples taken from prior work.
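To make the notion of a minimum-action path concrete, the sketch below discretizes a path between two fixed endpoints and minimizes a discretized action by gradient descent. It is not the spline-based amortized method from the poster. The Gaussian bump obstacle, the convention that the potential enters as a penalty (L = 0.5|v|^2 + U(x)), and all names and constants are assumptions made for illustration.

```python
import numpy as np

def action(path, dt, potential):
    # Discretized action: kinetic energy of each segment plus a potential
    # penalty (e.g., an obstacle) evaluated at the interior nodes.
    velocities = np.diff(path, axis=0) / dt
    kinetic = 0.5 * np.sum(velocities**2) * dt
    return float(kinetic + dt * np.sum(potential(path[1:-1])))

def action_grad(path, dt, potential_grad):
    # Gradient of the discretized action with respect to the interior nodes.
    interior = path[1:-1]
    kinetic = (2 * interior - path[:-2] - path[2:]) / dt
    return kinetic + dt * potential_grad(interior)

def optimize_path(path, dt, potential_grad, step=0.01, iters=5000):
    # Plain gradient descent on the interior nodes; the endpoints stay fixed.
    path = path.copy()
    for _ in range(iters):
        path[1:-1] -= step * action_grad(path, dt, potential_grad)
    return path

# Hypothetical obstacle: a Gaussian bump placed slightly above the straight line.
center, amp, sigma = np.array([0.5, 0.05]), 2.0, 0.2

def bump(x):
    return amp * np.exp(-np.sum((x - center) ** 2, axis=-1) / (2 * sigma**2))

def bump_grad(x):
    return -bump(x)[:, None] * (x - center) / sigma**2

n_segments = 10
dt = 1.0 / n_segments
start, end = np.array([0.0, 0.0]), np.array([1.0, 0.0])
straight = np.linspace(start, end, n_segments + 1)

# The optimized path bends away from the obstacle and has lower action.
bent = optimize_path(straight, dt, bump_grad)
```

With a zero potential the same routine relaxes any initial path back to the straight line, which is the Euclidean geodesic between the endpoints.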

[2024] End-to-end learning geometries for graphs, dynamical systems, and regression

Powerpoint | PDF

Every machine learning setting has an underlying geometry in which the data is represented and the predictions are performed. While defaulting the geometry to a Euclidean or known manifold is capable of building powerful models, learning a non-trivial geometry from data is useful for improving the overall performance and estimating unobserved structures. This talk focuses on learning geometries for:

  1. graph embeddings, where the geometry of the embedding (e.g., Euclidean, spherical, or hyperbolic) heavily influences the accuracy and distortion of the embedding depending on the graph's structure;
  2. dynamical systems, where the geometry of the state space can uncover unobserved properties of the underlying systems, e.g., geographic information such as obstacles or terrains; and
  3. regression, where the geometry of the prediction space influences where the model should be accurate or inaccurate for some downstream task.

We will focus on latent geometries in these settings that are not directly observable from the data, i.e., the geometry cannot be estimated as a submanifold of the Euclidean space the data is observed in. Instead, in these settings, the geometry can be shaped via a downstream signal that propagates through differentiable operations such as the geodesic distance and log/exp maps on Riemannian manifolds. The talk covers the foundational tools here on making operations differentiable (in general via the envelope and implicit function theorems, but potentially simpler when closed-form operations are available), and demonstrates where the end-to-end learned geometry is effective.
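As a small illustration of the closed-form case, the sketch below computes the geodesic distance on the unit sphere together with its gradient, so that a downstream loss on the distance could propagate back to the embedded points. The sphere is chosen only as a familiar example, and the function names are assumptions rather than anything defined in the talk.

```python
import numpy as np

def sphere_distance(x, y):
    # Geodesic (great-circle) distance between unit vectors x and y.
    return float(np.arccos(np.clip(x @ y, -1.0, 1.0)))

def sphere_distance_grad(x, y):
    # Closed-form Euclidean gradient of the distance with respect to x,
    # valid when x and y are neither identical nor antipodal.
    c = x @ y
    return -y / np.sqrt(1.0 - c**2)

def tangent_projection(x, g):
    # Project an ambient gradient onto the tangent space of the sphere at x.
    return g - (g @ x) * x

x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])

d = sphere_distance(x, y)
g = sphere_distance_grad(x, y)
riemannian_g = tangent_projection(x, g)
```

When no closed form is available, the same derivative would instead come from differentiating through the optimization problem that defines the geodesic.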

[2023] Amortized optimization for optimal transport

Powerpoint | PDF

Optimal transport has thriving applications in machine learning, computer vision, natural language processing, the physical sciences, and economics. These applications have largely been enabled by computational breakthroughs that have led to tractable solutions to challenging optimization problems, especially in discrete spaces through the use of convex optimization methods. Beyond these well-understood classes of problems, many difficult optimization problems and sub-problems in optimal transport remain open. This talk focuses on the use of learning methods to predict, or amortize, the solutions to these optimization problems. This amortization process incurs an initial computational cost of training a model to approximately predict the solutions, but afterwards, the model can produce predictions faster than solving the optimization problems from scratch to the same level of error. Furthermore, even inaccurate predictions are tolerable because they are easily detectable, e.g., via the optimality conditions, and can be fine-tuned by warm-starting an existing method with the prediction. The talk covers how to amortize the computation at three levels: 1) the optimal transport map or potential, 2) the c-transform or convex conjugate, and 3) costs defined by a Lagrangian.
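The amortize, check, and warm-start pattern described here can be sketched on a toy problem family. The example below trains a linear model to predict solutions of a parameterized quadratic program, measures the prediction's violation of the optimality condition, and then fine-tunes with gradient descent. The problem family, the linear model `W`, and the step sizes are illustrative assumptions and have nothing specific to optimal transport.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem family: x*(theta) = argmin_x 0.5 x^T Q x - theta^T x.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])

def objective_grad(x, theta):
    # Gradient in x; its norm measures how far x is from optimality.
    return Q @ x - theta

# Amortization: fit a linear model x_hat(theta) = W theta by gradient descent
# on the average objective over sampled problem instances.
thetas = rng.normal(size=(256, 2))
W = np.zeros((2, 2))
for _ in range(30):
    preds = thetas @ W.T                       # predicted solutions, one per row
    grads = preds @ Q.T - thetas               # objective gradients at predictions
    W -= 0.2 * grads.T @ thetas / len(thetas)  # chain rule through x_hat = W theta

def solve(theta, x_init, step=0.3, tol=1e-8, max_iters=10_000):
    # Plain gradient descent, returning the solution and the iteration count.
    x = x_init.copy()
    for it in range(max_iters):
        g = objective_grad(x, theta)
        if np.linalg.norm(g) < tol:
            return x, it
        x = x - step * g
    return x, max_iters

theta_new = np.array([1.0, -2.0])
x_pred = W @ theta_new

# The optimality condition makes an inaccurate prediction easy to detect.
pred_residual = float(np.linalg.norm(objective_grad(x_pred, theta_new)))

# Fine-tune by warm-starting the solver, and compare with a cold start.
x_cold, cold_iters = solve(theta_new, np.zeros(2))
x_warm, warm_iters = solve(theta_new, x_pred)
```

The training loop is deliberately stopped early so that the prediction is only approximate; the warm-started solver then needs fewer iterations than the cold-started one to reach the same tolerance.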

[2023] TaskMet Poster

Powerpoint | PDF

Deep learning models are often deployed in downstream tasks that the training procedure may not be aware of. For example, models solely trained to achieve accurate predictions may struggle to perform well on downstream tasks because seemingly small prediction errors may incur drastic task errors. The standard end-to-end learning approach is to make the task loss differentiable or to introduce a differentiable surrogate that the model can be trained on. In these settings, the task loss needs to be carefully balanced with the prediction loss because they may have conflicting objectives. We propose to take the task loss signal one level deeper than the parameters of the model and use it to learn the parameters of the loss function the model is trained on, which can be done by learning a metric in the prediction space. This approach does not alter the optimal prediction model itself, but rather changes the model learning to emphasize the information important for the downstream task. This enables us to achieve the best of both worlds: a prediction model trained in the original prediction space while also being valuable for the desired downstream task. We validate our approach through experiments conducted in two main settings: 1) decision-focused model learning scenarios involving portfolio optimization and budget allocation, and 2) reinforcement learning in noisy environments with distracting states.
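A toy sketch of why a metric in the prediction space changes what a model emphasizes without changing the ideal prediction is given below. It uses a Mahalanobis-style loss with a fixed metric and a deliberately capacity-limited predictor; in the poster's setting the metric would be learned from the task loss, which this sketch does not attempt. All names and numbers are assumptions for illustration.

```python
import numpy as np

def metric_loss(pred, target, metric):
    # Prediction loss under a metric: (pred - target)^T M (pred - target).
    diff = pred - target
    return float(diff @ metric @ diff)

def fit_constant(target, metric):
    # Capacity-limited model that predicts the same scalar c for every
    # coordinate; returns the closed-form minimizer of metric_loss.
    ones = np.ones_like(target)
    return float(ones @ metric @ target) / float(ones @ metric @ ones)

target = np.array([0.0, 1.0])
euclidean = np.eye(2)
task_metric = np.diag([1.0, 9.0])  # hypothetical: the task cares more about coordinate 2

c_euclid = fit_constant(target, euclidean)   # 0.5
c_task = fit_constant(target, task_metric)   # 0.9

# An unrestricted predictor attains zero loss at the target under either metric.
loss_at_target = metric_loss(target, target, task_metric)
```

The limited model shifts toward the coordinate the metric weights more heavily, while the exact target remains the minimizer under any positive-definite metric.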

[2023] On optimal control and machine learning

Powerpoint | PDF

This talk tours the optimal control and machine learning methodologies behind recent breakthroughs in the field. These are crucial components for building agents capable of computationally modeling and interacting with our world via planning and reasoning, e.g. for robotics, aircraft, autonomous vehicles, games, economics, finance, and language, as well as agricultural, biomedical, chemical, industrial, and mechanical systems. We will start with 1) a lightweight introduction to optimal control, and then cover 2) machine learning for optimal control --- this includes reinforcement learning and overviews how the powerful abstractive and predictive capabilities of machine learning can drastically improve every part of a control system; and 3) optimal control for machine learning --- surprisingly in this opposite direction, some machine learning problems are able to be formulated as control problems and solved with optimal control methods, e.g. parts of diffusion models, optimal transport, and optimizing the parameters of models such as large language models with reinforcement learning.

[2023] Learning with differentiable and amortized optimization

Powerpoint | PDF

Optimization has been a transformative modeling and decision-making paradigm over the past century that computationally encodes non-trivial reasoning operations. Developments in optimization foundations alongside domain experts have resulted in breakthroughs for 1) controlling robotic, autonomous, mechanical, and multi-agent systems, 2) making operational decisions based on future predictions, 3) efficiently transporting or matching resources, information, and measures, 4) allocating budgets and portfolios, 5) designing materials, molecules, and other structures, 6) solving inverse problems to infer underlying hidden costs, incentives, geometries, terrains, and other structures, and 7) learning and meta-learning the parameters of predictive and statistical models. These settings often analytically specify the relevant models of the world along with an explicit objective to optimize for. Once these are specified, computational optimization solvers are able to search over the space of possible solutions or configurations and return the best one.

The magic of optimization stops when 1) the relevant models of the world are too difficult or impossible to specify, leading to inaccurate or incomplete representations of the true setting, and 2) solving the optimization problem is computationally challenging and takes too long to return a solution on today's hardware. Machine learning methods help overcome both of these by providing fast predictive models and powerful latent abstractions of the world. In this talk, I will cover two ways of tightly integrating optimization and machine learning methods:

  1. Differentiable optimization characterizes how the solution to an optimization problem changes as the inputs change. In machine learning settings, differentiable optimization provides an implicit layer that integrates optimization-based domain knowledge into the model and enables unknown parts of the optimization problem to be learned. I will cover the foundations of learning these layers with implicit differentiation and highlight applications in robotics and control settings.

  2. Amortized optimization rapidly predicts approximate solutions to optimization problems and is useful when repeatedly solving optimization problems. Traditional optimization methods typically solve every new problem instance from scratch, ignoring shared structures and information when solving a new instance. In contrast, a solver augmented with amortized optimization learns the shared structure present in the solution mappings and searches the domain more effectively. I will cover the foundations of amortized optimization and highlight new applications in control and optimal transport.

[2023] Amortized optimization

Optimization is a ubiquitous modeling tool and is often deployed in settings which repeatedly solve similar instances of the same problem. Amortized optimization methods use learning to predict the solutions to problems in these settings, exploiting the shared structure between similar problem instances. These methods have been crucial in variational inference and reinforcement learning and are capable of solving optimization problems many orders of magnitude faster than traditional optimization methods that do not use amortization. This talk presents an introduction to the amortized optimization foundations behind these advancements and overviews their applications in variational inference, sparse coding, gradient-based meta-learning, control, reinforcement learning, convex optimization, optimal transport, and deep equilibrium networks.

Powerpoint | PDF | paper

[2023] On amortizing convex conjugates for optimal transport

This paper focuses on computing the convex conjugate operation that arises when solving Euclidean Wasserstein-2 optimal transport problems. This conjugation, which is also referred to as the Legendre-Fenchel conjugate or c-transform, is considered difficult to compute and in practice, Wasserstein-2 methods are limited by not being able to exactly conjugate the dual potentials in continuous space. To overcome this, the computation of the conjugate can be approximated with amortized optimization, which learns a model to predict the conjugate. I show that combining amortized approximations to the conjugate with a solver for fine-tuning significantly improves the quality of transport maps learned for the Wasserstein-2 benchmark by Korotin et al. (2021a) and is able to model many 2-dimensional couplings and flows considered in the literature.

Powerpoint | PDF | paper
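The idea of combining an approximate prediction with a fine-tuning solver can be sketched on a potential whose conjugate is known in closed form. The example below computes f*(y) = sup_x <x, y> - f(x) for f(x) = sum(cosh(x)) by running Newton's method from a cheap initial guess. The choice of potential, the stand-in predictor `predict_maximizer`, and the iteration count are assumptions for illustration; the paper's setting uses learned neural potentials and predictors.

```python
import numpy as np

def f(x):
    # Hypothetical convex potential with a closed-form conjugate.
    return float(np.sum(np.cosh(x)))

def conjugate(y, x_init, iters=20):
    # f*(y) = sup_x <x, y> - f(x). The maximizer satisfies sinh(x) = y, which
    # Newton's method solves coordinate-wise since the Hessian is diagonal.
    x = x_init.copy()
    for _ in range(iters):
        x = x - (np.sinh(x) - y) / np.cosh(x)
    return float(x @ y) - f(x), x

def predict_maximizer(y):
    # Stand-in for a learned amortization model (an assumption): uses the
    # small-x approximation sinh(x) ~ x, so the prediction is only approximate.
    return y.copy()

y = np.array([1.0, -0.5])
x_guess = predict_maximizer(y)
value, x_star = conjugate(y, x_guess)

# Closed form for comparison: f*(y) = sum(y * arcsinh(y) - sqrt(1 + y^2)).
closed_form = float(np.sum(y * np.arcsinh(y) - np.sqrt(1.0 + y**2)))
```

The prediction alone gives a lower bound on the conjugate value, and the fine-tuning step closes the gap to the exact value.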

[2023] Continuous optimal transport

Powerpoint | PDF

[2022] Amortized optimization for computing optimal transport maps

Powerpoint | PDF

[2022] Differentiable optimization

Powerpoint | PDF

[2022] Differentiable control

Powerpoint | PDF

[2022] Amortized optimization

Powerpoint | PDF

[2021] On the model-based stochastic value gradient for continuous RL

Powerpoint | PDF | paper

[2021] Riemannian Convex Potential Maps

Keynote | PDF | paper

[2020] Differentiable cross-entropy method

Powerpoint | PDF | paper

[2019] Ph.D. Thesis: Differentiable optimization-based modeling for machine learning

Powerpoint | PDF

[2018] PyTorch libraries for linear algebra, optimization, and control

Powerpoint | PDF

[2018] OptNet, end-to-end task-based learning, and control

Powerpoint | PDF

[2018] Differentiable MPC

Powerpoint | PDF | Poster Powerpoint | Poster PDF

[2017] OptNet

Powerpoint | PDF

[2017] ICNN

Powerpoint | PDF
