index.json
[{"authors":null,"categories":null,"content":"Hi! I am a Quantitative Researcher at Two Sigma. I earned my PhD in Computer and Information Science from the University of Pennsylvania. I was advised by Prof. Rajeev Alur. My research interests lie at the intersection of Formal Methods and Machine Learning. In particular, I am interested in Neurosymbolic Programming, Reinforcement Learning and Interpretable Machine Learning.\n","date":1685577600,"expirydate":-62135596800,"kind":"term","lang":"en","lastmod":1685667525,"objectID":"2525497d367e79493fd32b198b28f040","permalink":"","publishdate":"0001-01-01T00:00:00Z","relpermalink":"","section":"authors","summary":"Hi! I am a Quantitative Researcher at Two Sigma. I earned my PhD in Computer and Information Science from the University of Pennsylvania. I was advised by Prof. Rajeev Alur. My research interests lie at the intersection of Formal Methods and Machine Learning.","tags":null,"title":"Kishor Jothimurugan","type":"authors"},{"authors":[],"categories":null,"content":" Click on the Slides button above to view the built-in slides feature. Slides can be added in a few ways:\nCreate slides using Wowchemy’s Slides feature and link using slides parameter in the front matter of the talk file Upload an existing slide deck to static/ and link using url_slides parameter in the front matter of the talk file Embed your slides (e.g. Google Slides) or presentation video on this page using shortcodes. Further event details, including page elements such as image galleries, can be added to the body of this page.\n","date":1906549200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1906549200,"objectID":"a8edef490afe42206247b6ac05657af0","permalink":"https://keyshor.github.io/talk/example-talk/","publishdate":"2017-01-01T00:00:00Z","relpermalink":"/talk/example-talk/","section":"event","summary":"An example talk using Wowchemy's Markdown slides feature.","tags":[],"title":"Example Talk","type":"event"},{"authors":["Kishor Jothimurugan","Steve Hsu","Osbert Bastani","Rajeev Alur"],"categories":[],"content":"","date":1685577600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1685667525,"objectID":"c57141860ccb170f41d132846ec7405d","permalink":"https://keyshor.github.io/publication/jothimurugan-robust-2023/","publishdate":"2023-06-01T00:58:45.115288Z","relpermalink":"/publication/jothimurugan-robust-2023/","section":"publication","summary":"Compositional reinforcement learning is a promising approach for training policies to perform complex long-horizon tasks. Typically, a high-level task is decomposed into a sequence of subtasks and a separate policy is trained to perform each subtask. In this paper, we focus on the problem of training subtask policies in a way that they can be used to perform any task; here, a task is given by a sequence of subtasks. We aim to maximize the worst-case performance over all tasks as opposed to the average-case performance. We formulate the problem as a two agent zero-sum game in which the adversary picks the sequence of subtasks. We propose two RL algorithms to solve this game - one is an adaptation of existing multi-agent RL algorithms to our setting and the other is an asynchronous version which enables parallel training of subtask policies. 
We evaluate our approach on two multi-task environments with continuous states and actions and demonstrate that our algorithms outperform state-of-the-art baselines.","tags":["compositional learning","options","zero-shot generalization"],"title":"Robust Subtask Learning for Compositional Generalization","type":"publication"},{"authors":["Rajeev Alur","Osbert Bastani","Kishor Jothimurugan","Mateo Perez","Fabio Somenzi","Ashutosh Trivedi"],"categories":[],"content":"","date":1684108800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1684198724,"objectID":"66cfc4fad6877bf64fcac30e1e20d9e9","permalink":"https://keyshor.github.io/publication/alur-policy-2023/","publishdate":"2023-05-15T00:58:44.736816Z","relpermalink":"/publication/alur-policy-2023/","section":"publication","summary":"The difficulty of manually specifying reward functions has led to an interest in using linear temporal logic (LTL) to express objectives for reinforcement learning (RL). However, LTL has the downside that it is sensitive to small perturbations in the transition probabilities, which prevents probably approximately correct (PAC) learning without additional assumptions. Time discounting provides a way of removing this sensitivity, while retaining the high expressivity of the logic. We study the use of discounted LTL for policy synthesis in Markov decision processes with unknown transition probabilities, and show how to reduce discounted LTL to discounted-sum reward via a reward machine when all discount factors are identical.","tags":[],"title":"Policy Synthesis and Reinforcement Learning for Discounted LTL","type":"publication"},{"authors":null,"categories":null,"content":"The unprecedented proliferation of data-driven approaches, especially machine learning, has put the spotlight on building trustworthy AI through the combination of logical reasoning and machine learning. Reinforcement Learning from Logical Specifications is one such topic where formal logical constructs are utilized to overcome challenges faced by modern RL algorithms. Research on this topic is scattered across venues targeting subareas of AI. Foundational work has appeared at formal methods and AI venues. Algorithmic development and applications have appeared at machine learning, robotics, and cyber-physical systems venues. This tutorial consolidates recent progress in one capsule for a typical AI researcher. The tutorial is designed to explain the importance of using formal specifications in RL and encourage researchers to apply existing techniques for RL from logical specifications as well as contribute to the growing body of work on this topic.\nIn this tutorial, we introduce reinforcement learning as a tool for automated synthesis of control policies and discuss the challenge of encoding long-horizon tasks using rewards. We then formulate the problem of reinforcement learning from logical specifications and present recent progress in developing scalable algorithms as well as theoretical results demonstrating the hardness of learning in this context.\nThe syllabus of this tutorial can be found in the AAAI proposal.\nPresenters Rajeev Alur Suguman Bansal Osbert Bastani Kishor Jothimurugan Reading Material The tutorial is organized into three modules. Reading material corresponding to these modules as well as additional resources are provided below.\nIntroduction. We introduce reinforcement learning and motivation behind the use of logical specifications. 
We discuss two specification languages: LTL and SpectRL.\nFirst three sections of the paper on specifications in reinforcement learning Notes on Linear Temporal Logic (LTL) First two sections of the paper on SpectRL Practical Algorithms. We discuss two learning algorithms: one for LTL specs that is based on reward machines and a compositional algorithm for SpectRL specifications.\nPaper on reward machines Paper on generating reward machines from LTL specs Paper on a compositional RL algorithm for SpectRL specs Theoretical Results. We discuss hardness results regarding learning from logical specifications as well as a reward generation procedure for LTL specifications that has a weak optimality preservation guarantee.\nSections 4, 5 and 6 of the paper on specifications in reinforcement learning Paper on faithful reward generation from LTL specs Paper on good-for-MDP automata Additional Resources. Though not presented in the tutorial, the following material is provided for those interested in exploring further.\nPaper characterizing the exact class of LTL specs for which PAC learning is possible Paper providing an alternate approach for generating optimality preserving rewards from LTL specs Paper on multi-agent reinforcement learning from SpectRL specifications ","date":1675728000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1675728000,"objectID":"573ec768e9f179272a2be18261d6dbf7","permalink":"https://keyshor.github.io/teaching/aaai_tutorial/","publishdate":"2023-02-07T00:00:00Z","relpermalink":"/teaching/aaai_tutorial/","section":"teaching","summary":"The unprecedented proliferation of data-driven approaches, especially machine learning, has put the spotlight on building trustworthy AI through the combination of logical reasoning and machine learning. Reinforcement Learning from Logical Specifications is one such topic where formal logical constructs are utilized to overcome challenges faced by modern RL algorithms.","tags":null,"title":"AAAI Tutorial on Specification-Guided Reinforcement Learning","type":"teaching"},{"authors":null,"categories":null,"content":"Wowchemy is designed to give technical content creators a seamless experience. You can focus on the content and Wowchemy handles the rest.\nHighlight your code snippets, take notes on math classes, and draw diagrams from textual representation.\nOn this page, you’ll find some examples of the types of technical content that can be rendered with Wowchemy.\nExamples Code Wowchemy supports a Markdown extension for highlighting code syntax. 
You can customize the styles under the syntax_highlighter option in your config/_default/params.yaml file.\n```python import pandas as pd data = pd.read_csv(\u0026#34;data.csv\u0026#34;) data.head() ``` renders as\nimport pandas as pd data = pd.read_csv(\u0026#34;data.csv\u0026#34;) data.head() Mindmaps Wowchemy supports a Markdown extension for mindmaps.\nSimply insert a Markdown markmap code block and optionally set the height of the mindmap as shown in the example below.\nA simple mindmap defined as a Markdown list:\n```markmap {height=\u0026#34;200px\u0026#34;} - Hugo Modules - wowchemy - wowchemy-plugins-netlify - wowchemy-plugins-netlify-cms - wowchemy-plugins-reveal ``` renders as\n- Hugo Modules - wowchemy - wowchemy-plugins-netlify - wowchemy-plugins-netlify-cms - wowchemy-plugins-reveal A more advanced mindmap with formatting, code blocks, and math:\n```markmap - Mindmaps - Links - [Wowchemy Docs](https://wowchemy.com/docs/) - [Discord Community](https://discord.gg/z8wNYzb) - [GitHub](https://github.com/wowchemy/wowchemy-hugo-themes) - Features - Markdown formatting - **inline** ~~text~~ *styles* - multiline text - `inline code` - ```js console.log(\u0026#39;hello\u0026#39;); console.log(\u0026#39;code block\u0026#39;); ``` - Math: $x = {-b \\pm \\sqrt{b^2-4ac} \\over 2a}$ ``` renders as\n- Mindmaps - Links - [Wowchemy Docs](https://wowchemy.com/docs/) - [Discord Community](https://discord.gg/z8wNYzb) - [GitHub](https://github.com/wowchemy/wowchemy-hugo-themes) - Features - Markdown formatting - **inline** ~~text~~ *styles* - multiline text - `inline code` - ```js console.log(\u0026#39;hello\u0026#39;); console.log(\u0026#39;code block\u0026#39;); ``` - Math: $x = {-b \\pm \\sqrt{b^2-4ac} \\over 2a}$ Charts Wowchemy supports the popular Plotly format for interactive charts.\nSave your Plotly JSON in your page folder, for example line-chart.json, and then add the {{\u0026lt; chart data=\u0026#34;line-chart\u0026#34; \u0026gt;}} shortcode where you would like the chart to appear.\nDemo:\nYou might also find the Plotly JSON Editor useful.\nMath Wowchemy supports a Markdown extension for $\\LaTeX$ math. You can enable this feature by toggling the math option in your config/_default/params.yaml file.\nTo render inline or block math, wrap your LaTeX math with {{\u0026lt; math \u0026gt;}}$...${{\u0026lt; /math \u0026gt;}} or {{\u0026lt; math \u0026gt;}}$$...$${{\u0026lt; /math \u0026gt;}}, respectively. (We wrap the LaTeX math in the Wowchemy math shortcode to prevent Hugo rendering our math as Markdown. 
The math shortcode is new in v5.5-dev.)\nExample math block:\n{{\u0026lt; math \u0026gt;}} $$ \\gamma_{n} = \\frac{ \\left | \\left (\\mathbf x_{n} - \\mathbf x_{n-1} \\right )^T \\left [\\nabla F (\\mathbf x_{n}) - \\nabla F (\\mathbf x_{n-1}) \\right ] \\right |}{\\left \\|\\nabla F(\\mathbf{x}_{n}) - \\nabla F(\\mathbf{x}_{n-1}) \\right \\|^2} $$ {{\u0026lt; /math \u0026gt;}} renders as\n$$\\gamma_{n} = \\frac{ \\left | \\left (\\mathbf x_{n} - \\mathbf x_{n-1} \\right )^T \\left [\\nabla F (\\mathbf x_{n}) - \\nabla F (\\mathbf x_{n-1}) \\right ] \\right |}{\\left \\|\\nabla F(\\mathbf{x}_{n}) - \\nabla F(\\mathbf{x}_{n-1}) \\right \\|^2}$$ Example inline math {{\u0026lt; math \u0026gt;}}$\\nabla F(\\mathbf{x}_{n})${{\u0026lt; /math \u0026gt;}} renders as $\\nabla F(\\mathbf{x}_{n})$.\nExample multi-line math using the math linebreak (\\\\):\n{{\u0026lt; math \u0026gt;}} $$f(k;p_{0}^{*}) = \\begin{cases}p_{0}^{*} \u0026amp; \\text{if }k=1, \\\\ 1-p_{0}^{*} \u0026amp; \\text{if }k=0.\\end{cases}$$ {{\u0026lt; /math \u0026gt;}} renders as\n$$ f(k;p_{0}^{*}) = \\begin{cases}p_{0}^{*} \u0026amp; \\text{if }k=1, \\\\ 1-p_{0}^{*} \u0026amp; \\text{if }k=0.\\end{cases} $$ Diagrams Wowchemy supports a Markdown extension for diagrams. You can enable this feature by toggling the diagram option in your config/_default/params.toml file or by adding diagram: true to your page front matter.\nAn example flowchart:\n```mermaid graph TD A[Hard] --\u0026gt;|Text| B(Round) B --\u0026gt; C{Decision} C --\u0026gt;|One| D[Result 1] C --\u0026gt;|Two| E[Result 2] ``` renders as\ngraph TD A[Hard] --\u0026gt;|Text| B(Round) B --\u0026gt; C{Decision} C --\u0026gt;|One| D[Result 1] C --\u0026gt;|Two| E[Result 2] An example sequence diagram:\n```mermaid sequenceDiagram Alice-\u0026gt;\u0026gt;John: Hello John, how are you? loop Healthcheck John-\u0026gt;\u0026gt;John: Fight against hypochondria end Note right of John: Rational thoughts! John--\u0026gt;\u0026gt;Alice: Great! John-\u0026gt;\u0026gt;Bob: How about you? Bob--\u0026gt;\u0026gt;John: Jolly good! ``` renders as\nsequenceDiagram Alice-\u0026gt;\u0026gt;John: Hello John, how are you? loop Healthcheck John-\u0026gt;\u0026gt;John: Fight against hypochondria end Note right of John: Rational thoughts! John--\u0026gt;\u0026gt;Alice: Great! John-\u0026gt;\u0026gt;Bob: How about you? Bob--\u0026gt;\u0026gt;John: Jolly good! An example Gantt diagram:\n```mermaid gantt section Section Completed :done, des1, 2014-01-06,2014-01-08 Active :active, des2, 2014-01-07, 3d Parallel 1 : des3, after des1, 1d Parallel 2 : des4, after des1, 1d Parallel 3 : des5, after des3, 1d Parallel 4 : des6, after des4, 1d ``` renders …","date":1668038400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1668038400,"objectID":"24cf96050dd28883a7d6c2207f435f2a","permalink":"https://keyshor.github.io/random/wowchemy_syntax/","publishdate":"2022-11-10T00:00:00Z","relpermalink":"/random/wowchemy_syntax/","section":"random","summary":"Wowchemy is designed to give technical content creators a seamless experience. 
You can focus on the content and Wowchemy handles the rest.\nHighlight your code snippets, take notes on math classes, and draw diagrams from textual representation.","tags":null,"title":"How to Write Wowchemy Posts","type":"random"},{"authors":["Rajeev Alur","Suguman Bansal","Osbert Bastani","Kishor Jothimurugan"],"categories":[],"content":"","date":1659312000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017924,"objectID":"94a91a8b1f3303f5051db3d3330c823f","permalink":"https://keyshor.github.io/publication/alur-framework-2021/","publishdate":"2022-10-06T00:58:44.736816Z","relpermalink":"/publication/alur-framework-2021/","section":"publication","summary":"Reactive synthesis algorithms allow automatic construction of policies to control an environment modeled as a Markov Decision Process (MDP) that are optimal with respect to high-level temporal logic specifications. However, they assume that the MDP model is known a priori. Reinforcement Learning (RL) algorithms, in contrast, are designed to learn an optimal policy when the transition probabilities of the MDP are unknown, but require the user to associate local rewards with transitions. The appeal of high-level temporal logic specifications has motivated research to develop RL algorithms for synthesis of policies from specifications. To understand the techniques, and nuanced variations in their theoretical guarantees, in the growing body of resulting literature, we develop a formal framework for defining transformations among RL tasks with different forms of objectives. We define the notion of a sampling-based reduction to transform a given MDP into another one which can be simulated even when the transition probabilities of the original MDP are unknown. We formalize the notions of preservation of optimal policies, convergence, and robustness of such reductions. We then use our framework to restate known results, establish new results to fill in some gaps, and identify open problems. In particular, we show that certain kinds of reductions from LTL specifications to reward-based ones do not exist, and prove the non-existence of RL algorithms with PAC-MDP guarantees for safety specifications.","tags":[],"title":"A Framework for Transforming Specifications in Reinforcement Learning","type":"publication"},{"authors":["Kishor Jothimurugan","Suguman Bansal","Osbert Bastani","Rajeev Alur"],"categories":[],"content":"","date":1659312000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017925,"objectID":"677bb95f7b71f751cdbf9acd1ea86441","permalink":"https://keyshor.github.io/publication/jothimurugan-specification-guided-2022/","publishdate":"2022-10-06T00:58:44.92654Z","relpermalink":"/publication/jothimurugan-specification-guided-2022/","section":"publication","summary":"Reinforcement learning has been shown to be an effective strategy for automatically training policies for challenging control problems. Focusing on non-cooperative multi-agent systems, we propose a novel reinforcement learning framework for training joint policies that form a Nash equilibrium. In our approach, rather than providing low-level reward functions, the user provides high-level specifications that encode the objective of each agent. Then, guided by the structure of the specifications, our algorithm searches over policies to identify one that provably forms an ϵ-Nash equilibrium (with high probability). Importantly, it prioritizes policies in a way that maximizes social welfare across all agents. 
Our empirical evaluation demonstrates that our algorithm computes equilibrium policies with high social welfare, whereas state-of-the-art baselines either fail to compute Nash equilibria or compute ones with comparatively lower social welfare.","tags":[],"title":"Specification-Guided Learning of Nash Equilibria with High Social Welfare","type":"publication"},{"authors":["Kishor Jothimurugan","Suguman Bansal","Osbert Bastani","Rajeev Alur"],"categories":[],"content":"","date":1638316800,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017924,"objectID":"707a6d65e86464265dd7f8bad29ea931","permalink":"https://keyshor.github.io/publication/jothimurugan-compositional-2021/","publishdate":"2022-10-06T00:58:43.768815Z","relpermalink":"/publication/jothimurugan-compositional-2021/","section":"publication","summary":"We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DIRL, that interleaves high-level planning and reinforcement learning. First, DIRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines.","tags":[],"title":"Compositional Reinforcement Learning from Logical Specifications","type":"publication"},{"authors":["Radoslav Ivanov","Kishor Jothimurugan","Steve Hsu","Shaan Vaidya","Rajeev Alur","Osbert Bastani"],"categories":[],"content":"","date":1627776000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017924,"objectID":"6d0485e349148c38a5514fb82def0624","permalink":"https://keyshor.github.io/publication/ivanov-compositional-2021/","publishdate":"2022-10-06T00:58:44.360313Z","relpermalink":"/publication/ivanov-compositional-2021/","section":"publication","summary":"Recent advances in deep learning have enabled data-driven controller design for autonomous systems. However, verifying safety of such controllers, which are often hard-to-analyze neural networks, remains a challenge. Inspired by compositional strategies for program verification, we propose a framework for compositional learning and verification of neural network controllers. Our approach is to decompose the task (e.g., car navigation) into a sequence of subtasks (e.g., segments of the track), each corresponding to a different mode of the system (e.g., go straight or turn). Then, we learn a separate controller for each mode, and verify correctness by proving that (i) each controller is correct within its mode, and (ii) transitions between modes are correct. This compositional strategy not only improves scalability of both learning and verification, but also enables our approach to verify correctness for arbitrary compositions of the subtasks. 
To handle partial observability (e.g., LiDAR), we additionally learn and verify a mode predictor that predicts which controller to use. Finally, our framework also incorporates an algorithm that, given a set of controllers, automatically synthesizes the pre- and postconditions required by our verification procedure. We validate our approach in a case study on a simulation model of the F1/10 autonomous car, a system that poses challenges for existing verification tools due to both its reliance on LiDAR observations, as well as the need to prove safety for complex track geometries. We leverage our framework to learn and verify a controller that safely completes any track consisting of an arbitrary sequence of five kinds of track segments.","tags":["compositional reasoning","neural networks","verification"],"title":"Compositional Learning and Verification of Neural Network Controllers","type":"publication"},{"authors":["Kishor Jothimurugan","Osbert Bastani","Rajeev Alur"],"categories":[],"content":"","date":1617235200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017924,"objectID":"062e9956a5acb4f7ba496bea18726254","permalink":"https://keyshor.github.io/publication/jothimurugan-abstract-2021/","publishdate":"2022-10-06T00:58:44.55044Z","relpermalink":"/publication/jothimurugan-abstract-2021/","section":"publication","summary":"We propose a novel hierarchical reinforcement learning framework for control with continuous state and action spaces. In our framework, the user specifies subgoal regions which are subsets of states; then, we (i) learn options that serve as transitions between these subgoal regions, and (ii) construct a high-level plan in the resulting abstract decision process (ADP). A key challenge is that the ADP may not be Markov; we propose two algorithms for planning in the ADP that address this issue. Our first algorithm is conservative, allowing us to prove theoretical guarantees on its performance, which help inform the design of subgoal regions. Our second algorithm is a practical one that interweaves planning at the abstract level and learning at the concrete level. In our experiments, we demonstrate that our approach outperforms state-of-the-art hierarchical reinforcement learning algorithms on several challenging benchmarks.","tags":[],"title":"Abstract Value Iteration for Hierarchical Reinforcement Learning","type":"publication"},{"authors":["Kishor Jothimurugan","Matthew Andrews","Jeongran Lee","Lorenzo Maggi"],"categories":[],"content":"Abstract We study regenerative stopping problems in which the system starts anew whenever the controller decides to stop and the long-term average cost is to be minimized. Traditional model-based solutions involve estimating the underlying process from data and computing strategies for the estimated model. In this paper, we compare such solutions to deep reinforcement learning and imitation learning which involve learning a neural network policy from simulations. 
We evaluate the different approaches on a real-world problem of shipping consolidation in logistics and demonstrate that deep learning can be effectively used to solve such problems.\n","date":1598918400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1598918400,"objectID":"12f1c5550eeab6b9960630b5eb10fe95","permalink":"https://keyshor.github.io/preprints/bell-labs/","publishdate":"2020-09-01T00:00:00Z","relpermalink":"/preprints/bell-labs/","section":"preprints","summary":"Abstract We study regenerative stopping problems in which the system starts anew whenever the controller decides to stop and the long-term average cost is to be minimized. Traditional model-based solutions involve estimating the underlying process from data and computing strategies for the estimated model.","tags":[],"title":"Learning Algorithms for Regenerative Stopping Problems with Applications to Shipping Consolidation in Logistics","type":"preprints"},{"authors":["Rajeev Alur","Yu Chen","Kishor Jothimurugan","Sanjeev Khanna"],"categories":[],"content":"","date":1593561600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017925,"objectID":"a57ac735f35e312b46674633f45ee7d8","permalink":"https://keyshor.github.io/publication/alur-space-efficient-2020/","publishdate":"2022-10-06T00:58:45.115288Z","relpermalink":"/publication/alur-space-efficient-2020/","section":"publication","summary":"Real-time decision making in IoT applications relies upon space-efficient evaluation of queries over streaming data. To model the uncertainty in the classification of data being processed, we consider the model of probabilistic strings --- sequences of discrete probability distributions over a finite set of events, and initiate the study of space complexity of streaming computation for different classes of queries over such probabilistic strings. We first consider the problem of computing the probability that a word, sampled from the distribution defined by the probabilistic string read so far, is accepted by a given deterministic finite automaton. We show that this regular pattern matching problem can be solved using space that is only poly-logarithmic in the string length (and polynomial in the size of the DFA) if we are allowed a multiplicative approximation error. Then we show how to generalize this result to quantitative queries specified by additive cost register automata --- these are automata that map strings to numerical values using finite control and registers that get updated using linear transformations. Finally, we consider the case when updates in such an automaton involve tests, and in particular, when there is a counter variable that can be either incremented or decremented but decrements only apply when the counter value is non-zero. In this case, the desired answer depends on the probability distribution over the set of possible counter values that can range from 0 to n for a string of length n. Under a mild assumption, namely probabilities of the individual events are bounded away from 0 and 1, we show that there is an algorithm that can compute all n entries of this probability distribution vector to within additive 1/poly(n) error using space that is only Õ(n). 
In establishing these results, we introduce several new technical ideas that may prove useful for designing space-efficient algorithms for other query models over probabilistic strings.","tags":["probabilistic streams","query processing over streams","streaming algorithms"],"title":"Space-efficient Query Evaluation over Probabilistic Event Streams","type":"publication"},{"authors":["Kishor Jothimurugan","Rajeev Alur","Osbert Bastani"],"categories":[],"content":"","date":1575158400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1665017924,"objectID":"91caff25258f07e215b2a6bcc29f6582","permalink":"https://keyshor.github.io/publication/jothimurugan-composable-2019/","publishdate":"2022-10-06T00:58:44.166762Z","relpermalink":"/publication/jothimurugan-composable-2019/","section":"publication","summary":"Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping. We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.","tags":[],"title":"A Composable Specification Language for Reinforcement Learning Tasks","type":"publication"}]
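The Wowchemy guide captured in the index above describes the chart shortcode: save a Plotly figure as JSON in the page folder (for example line-chart.json) and embed it with the {{< chart data="line-chart" >}} shortcode. As a minimal sketch of what such a line-chart.json might contain — assuming only the standard Plotly figure schema of a "data" list of traces plus a "layout" object, with every value below being a placeholder — it could look like this:

```json
{
  "data": [
    {
      "x": [1, 2, 3, 4, 5],
      "y": [10, 15, 13, 17, 21],
      "type": "scatter",
      "mode": "lines+markers",
      "name": "placeholder series"
    }
  ],
  "layout": {
    "title": "Example line chart (placeholder)",
    "xaxis": { "title": "x (placeholder)" },
    "yaxis": { "title": "y (placeholder)" }
  }
}
```

With this file saved alongside the page, the {{< chart data="line-chart" >}} shortcode from the guide refers to it by name without the .json extension and renders it as an interactive chart.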