New submissions for Fri, 21 Apr 23 #38

A-suozhang opened this issue Apr 22, 2023 · 0 comments

Keyword: efficient

Evolving Constrained Reinforcement Learning Policy

  • Authors: Chengpeng Hu, Jiyuan Pei, Jialin Liu, Xin Yao
  • Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.09869
  • Pdf link: https://arxiv.org/pdf/2304.09869
  • Abstract
    Evolutionary algorithms have been used to evolve a population of actors to generate diverse experiences for training reinforcement learning agents, which helps to tackle the temporal credit assignment problem and improves the exploration efficiency. However, when adapting this approach to address constrained problems, balancing the trade-off between the reward and constraint violation is hard. In this paper, we propose a novel evolutionary constrained reinforcement learning (ECRL) algorithm, which adaptively balances the reward and constraint violation with stochastic ranking, and at the same time, restricts the policy's behaviour by maintaining a set of Lagrange relaxation coefficients with a constraint buffer. Extensive experiments on robotic control benchmarks show that our ECRL achieves outstanding performance compared to state-of-the-art algorithms. Ablation analysis shows the benefits of introducing stochastic ranking and constraint buffer.
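
Stochastic ranking is a known procedure from evolutionary constrained optimization (Runarsson & Yao, 2000); the sketch below illustrates the general mechanism the abstract invokes, not ECRL itself. The `reward` and `violation` arrays are hypothetical inputs.

```python
import random

def stochastic_rank(reward, violation, p_f=0.45, sweeps=None):
    """Rank individuals by stochastic ranking (Runarsson & Yao, 2000).

    When both individuals in an adjacent pair are feasible, or with
    probability p_f, they are compared by reward; otherwise they are
    compared by constraint violation. Returns indices best-to-worst.
    """
    n = len(reward)
    idx = list(range(n))
    for _ in range(sweeps or n):
        swapped = False
        for i in range(n - 1):
            a, b = idx[i], idx[i + 1]
            both_feasible = violation[a] == 0 and violation[b] == 0
            if both_feasible or random.random() < p_f:
                # Compare by objective: higher reward first.
                if reward[a] < reward[b]:
                    idx[i], idx[i + 1] = b, a
                    swapped = True
            else:
                # Compare by violation: lower violation first.
                if violation[a] > violation[b]:
                    idx[i], idx[i + 1] = b, a
                    swapped = True
        if not swapped:
            break
    return idx
```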

GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models

  • Authors: Li Zaitang, Pin-Yu Chen, Tsung-Yi Ho
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.09875
  • Pdf link: https://arxiv.org/pdf/2304.09875
  • Abstract
    Current studies on adversarial robustness mainly focus on aggregating local robustness results from a set of data samples to evaluate and rank different models. However, the local statistics may not well represent the true global robustness of the underlying unknown data distribution. To address this challenge, this paper makes the first attempt to present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models. Formally, GREAT Score carries the physical meaning of a global statistic capturing a mean certified attack-proof perturbation level over all samples drawn from a generative model. For finite-sample evaluation, we also derive a probabilistic guarantee on the sample complexity and the difference between the sample mean and the true mean. GREAT Score has several advantages: (1) Robustness evaluations using GREAT Score are efficient and scalable to large models, by sparing the need to run adversarial attacks. In particular, we show high correlation and significantly reduced computation cost of GREAT Score when compared to the attack-based model ranking on RobustBench (Croce et al., 2021). (2) The use of generative models facilitates the approximation of the unknown data distribution. In our ablation study with different generative adversarial networks (GANs), we observe consistency between global robustness evaluation and the quality of GANs. (3) GREAT Score can be used for remote auditing of privacy-sensitive black-box models, as demonstrated by our robustness evaluation on several online facial recognition services.
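
As described, the score is the sample mean of a per-sample certified perturbation level over generator draws; a schematic sketch, where `generator` and `certified_radius` are hypothetical stand-ins for the paper's components:

```python
import statistics

def great_score(generator, certified_radius, n_samples=1000):
    """Estimate a global robustness score as the sample mean of a
    per-sample certified attack-proof perturbation level, following
    the description in the abstract (schematic, not the paper's code)."""
    radii = [certified_radius(generator()) for _ in range(n_samples)]
    # Finite-sample mean plus spread, to which a concentration bound
    # like the paper's probabilistic guarantee would be applied.
    return statistics.mean(radii), statistics.stdev(radii)
```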

The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages

  • Authors: Vesa Akerman, David Baines, Damien Daspit, Ulf Hermjakob, Taeho Jang, Colin Leong, Michael Martin, Joel Mathew, Jonathan Robie, Marcus Schwarting
  • Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.09919
  • Pdf link: https://arxiv.org/pdf/2304.09919
  • Abstract
    Efficiently and accurately translating a corpus into a low-resource language remains a challenge, regardless of the strategies employed, whether manual, automated, or a combination of the two. Many Christian organizations are dedicated to the task of translating the Holy Bible into languages that lack a modern translation. Bible translation (BT) work is currently underway for over 3000 extremely low-resource languages. We introduce the eBible corpus: a dataset containing 1009 translations of portions of the Bible with data in 833 different languages across 75 language families. In addition to a BT benchmarking dataset, we introduce model performance benchmarks built on the No Language Left Behind (NLLB) neural machine translation (NMT) models. Finally, we describe several problems specific to the domain of BT and consider how the established data and model benchmarks might be used for future translation efforts. For a BT task trained with NLLB, the Austronesian and Trans-New Guinea language families achieve 35.1 and 31.6 BLEU scores respectively, which spurs future innovations for NMT for low-resource languages in Papua New Guinea.
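
The NLLB checkpoints used for the benchmarks are openly available; a minimal translation sketch with Hugging Face transformers, where the checkpoint and language codes are illustrative examples and the paper's fine-tuning setup may differ:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("In the beginning God created the heaven and the earth.",
                   return_tensors="pt")
# Force the decoder to start in the target language (French as an example).
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```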

A robust and interpretable deep learning framework for multi-modal registration via keypoints

  • Authors: Alan Q. Wang, Evan M. Yu, Adrian V. Dalca, Mert R. Sabuncu
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.09941
  • Pdf link: https://arxiv.org/pdf/2304.09941
  • Abstract
    We present KeyMorph, a deep learning-based image registration framework that relies on automatically detecting corresponding keypoints. State-of-the-art deep learning methods for registration are often not robust to large misalignments, are not interpretable, and do not incorporate the symmetries of the problem. In addition, most models produce only a single prediction at test time. Our core insight, which addresses these shortcomings, is that corresponding keypoints between images can be used to obtain the optimal transformation via a differentiable closed-form expression. We use this observation to drive the end-to-end learning of keypoints tailored for the registration task, without knowledge of ground-truth keypoints. This framework not only leads to substantially more robust registration but also yields better interpretability, since the keypoints reveal which parts of the image are driving the final alignment. Moreover, KeyMorph can be designed to be equivariant under image translations and/or symmetric with respect to the input image ordering. Finally, we show how multiple deformation fields can be computed efficiently and in closed-form at test time, corresponding to different transformation variants. We demonstrate the proposed framework in solving 3D affine and spline-based registration of multi-modal brain MRI scans. In particular, we show registration accuracy that surpasses current state-of-the-art methods, especially in the context of large displacements. Our code is available at https://github.com/evanmy/keymorph.

Baugh-Wooley Multiplication for the RISCV Processor

  • Authors: Franc Grootjen, Nikolai Schauer
  • Subjects: Hardware Architecture (cs.AR)
  • Arxiv link: https://arxiv.org/abs/2304.09952
  • Pdf link: https://arxiv.org/pdf/2304.09952
  • Abstract
    This article describes an efficient way to implement the multiplication instructions for a RISCV processor. Instead of using three predefined IP blocks for signed, unsigned and mixed multiplication, this article presents a novel extension to the Baugh-Wooley multiplication algorithm which reduces area and power consumption by roughly a factor of three.
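
For reference, the textbook Baugh-Wooley scheme (which the paper extends) handles two's-complement operands by complementing the sign-bit partial products and adding fixed correction constants; a small Python model, checked against ordinary signed multiplication:

```python
def baugh_wooley(a, b, n=8):
    """Signed n-bit multiply via the textbook Baugh-Wooley scheme.

    Partial products involving the sign bit are complemented, and
    constant 1s are added at bit positions n and 2n-1; the 2n-bit
    accumulator is then read back as a two's-complement number.
    """
    mask = (1 << n) - 1
    ab = [((a & mask) >> i) & 1 for i in range(n)]
    bb = [((b & mask) >> i) & 1 for i in range(n)]
    acc = 0
    for i in range(n - 1):
        for j in range(n - 1):
            acc += (ab[i] & bb[j]) << (i + j)          # positive terms
    acc += (ab[n - 1] & bb[n - 1]) << (2 * n - 2)      # sign x sign
    for j in range(n - 1):                             # complemented rows
        acc += (1 ^ (ab[n - 1] & bb[j])) << (n - 1 + j)
        acc += (1 ^ (ab[j] & bb[n - 1])) << (n - 1 + j)
    acc += (1 << n) + (1 << (2 * n - 1))               # correction constants
    acc &= (1 << (2 * n)) - 1                          # keep 2n bits
    return acc - (1 << (2 * n)) if acc >> (2 * n - 1) else acc

# Exhaustive check for 8-bit operands.
assert all(baugh_wooley(x, y) == x * y
           for x in range(-128, 128) for y in range(-128, 128))
```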

MasakhaNEWS: News Topic Classification for African languages

  • Authors: David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, Jesujoba Oluwadara Alabi, Atnafu Lambebo Tonja, Christine Mwase, Odunayo Ogundepo, Bonaventure F. P. Dossou, Akintunde Oladipo, Doreen Nixdorf, Chris Chinenye Emezue, Sana Sabah al-azzawi, Blessing K. Sibanda, Davis David, Lolwethu Ndolela, Jonathan Mukiibi, Tunde Oluwaseyi Ajayi, Tatiana Moteu Ngoli, Brian Odhiambo, Abraham Toluwase Owodunni, Nnaemeka C. Obiefuna, Shamsuddeen Hassan Muhammad, Saheed Salahudeen Abdullahi, Mesay Gemeda Yigezu, Tajuddeen Gwadabe, Idris Abdulmumin, Mahlet Taye Bame, Oluwabusayo Olufunke Awoyomi, Iyanuoluwa Shode, Tolulope Anu Adelani, Habiba Abdulganiy Kailani, Abdul-Hakeem Omotayo, Adetola Adeeko, Afolabi Abeeb, Anuoluwapo Aremu, Olanrewaju Samuel, Clemencia Siro, Wangari Kimotho, Onyekachi Raphael Ogbu, et al. (23 additional authors not shown)
  • Subjects: Computation and Language (cs.CL)
  • Arxiv link: https://arxiv.org/abs/2304.09972
  • Pdf link: https://arxiv.org/pdf/2304.09972
  • Abstract
    African languages are severely under-represented in NLP research due to a lack of datasets covering several NLP tasks. While there are individual language-specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named entity recognition and machine translation) have standardized benchmark datasets covering several geographical and typologically-diverse African languages. In this paper, we develop MasakhaNEWS -- a new benchmark dataset for news topic classification covering 16 languages widely spoken in Africa. We provide an evaluation of baseline models by training classical machine learning models and fine-tuning several language models. Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning, such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence transformer fine-tuning (SetFit and Cohere Embedding API). Our evaluation in the zero-shot setting shows the potential of prompting ChatGPT for news topic classification in low-resource African languages, achieving an average performance of 70 F1 points without leveraging additional supervision like MAD-X. In the few-shot setting, we show that with as few as 10 examples per label, we achieve more than 90% (i.e. 86.0 F1 points) of the performance of full supervised training (92.6 F1 points) leveraging the PET approach.

Equilibrium-Invariant Embedding, Metric Space, and Fundamental Set of $2\times2$ Normal-Form Games

  • Authors: Luke Marris, Ian Gemp, Georgios Piliouras
  • Subjects: Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA); Theoretical Economics (econ.TH); Optimization and Control (math.OC)
  • Arxiv link: https://arxiv.org/abs/2304.09978
  • Pdf link: https://arxiv.org/pdf/2304.09978
  • Abstract
    Equilibrium solution concepts of normal-form games, such as Nash equilibria, correlated equilibria, and coarse correlated equilibria, describe the joint strategy profiles from which no player has an incentive to unilaterally deviate. They are widely studied in game theory, economics, and multiagent systems. Equilibrium concepts are invariant under certain transforms of the payoffs. We define an equilibrium-inspired distance metric for the space of all normal-form games and uncover a distance-preserving equilibrium-invariant embedding. Furthermore, we propose an additional transform which defines a better-response-invariant distance metric and embedding. To demonstrate these metric spaces, we study $2\times2$ games. The equilibrium-invariant embedding of $2\times2$ games has an efficient two-variable parameterization (a reduction from eight), where each variable geometrically describes an angle on a unit circle. Interesting properties can be spatially inferred from the embedding, including: equilibrium support, cycles, competition, coordination, distances, best-responses, and symmetries. The best-response-invariant embedding of $2\times2$ games, after considering symmetries, rediscovers a set of 15 games and their respective equivalence classes. We propose that this set of game classes is fundamental and captures all possible interesting strategic interactions in $2\times2$ games. We introduce a directed graph representation and name for each class. Finally, we leverage the tools developed for $2\times2$ games to develop game-theoretic visualizations of large normal-form and extensive-form games that aim to fingerprint the strategic interactions that occur within.

Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra

  • Authors: Jonas Kulhanek, Torsten Sattler
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.09987
  • Pdf link: https://arxiv.org/pdf/2304.09987
  • Abstract
    Neural Radiance Fields (NeRFs) are a very recent and very popular approach for the problems of novel view synthesis and 3D reconstruction. A popular scene representation used by NeRFs is to combine a uniform, voxel-based subdivision of the scene with an MLP. Based on the observation that a (sparse) point cloud of the scene is often available, this paper proposes to use an adaptive representation based on tetrahedra and a Delaunay representation instead of the uniform subdivision or point-based representations. We show that such a representation enables efficient training and leads to state-of-the-art results. Our approach elegantly combines concepts from 3D geometry processing, triangle-based rendering, and modern neural radiance fields. Compared to voxel-based representations, ours provides more detail around parts of the scene likely to be close to the surface. Compared to point-based representations, our approach achieves better performance.

AI-coherent data-driven forecasting model for a combined cycle power plant

  • Authors: Mir Sayed Shah Danish, Zahra Nazari, Tomonobu Senjyu
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.10009
  • Pdf link: https://arxiv.org/pdf/2304.10009
  • Abstract
    This study investigates the transformation of energy models to align with machine learning requirements as a promising tool for optimizing the operation of combined cycle power plants (CCPPs). By modeling energy production as a function of environmental and control variables, this methodology offers an innovative way to achieve energy-efficient power generation in the context of data-driven applications. This study focuses on developing a thorough AI-coherent modeling approach for CCPP optimization, preferring an interdisciplinary perspective and coming up with a comprehensive, insightful analysis. The proposed numerical model using the Broyden Fletcher Goldfarb Shanno (BFGS) algorithm enhances efficiency by simulating various operating scenarios and adjusting optimal parameters, leading to a 2.23% increase in power generation, from 452 MW to 462.1 MW, by optimizing the environmental factors. This study deals with data-driven modeling based on historical data to make predictions without prior knowledge of the system's parameters, demonstrating several merits in identifying patterns that can be difficult for human analysts to detect, high accuracy when trained on large datasets, and the potential to improve over time with new data. The proposed modeling approach and methodology can be expanded as a valuable tool for forecasting and decision-making in complex energy systems.

Dynablox: Real-time Detection of Diverse Dynamic Objects in Complex Environments

  • Authors: Lukas Schmid, Olov Andersson, Aurelio Sulser, Patrick Pfreundschuh, Roland Siegwart
  • Subjects: Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/2304.10049
  • Pdf link: https://arxiv.org/pdf/2304.10049
  • Abstract
    Real-time detection of moving objects is an essential capability for robots acting autonomously in dynamic environments. We thus propose Dynablox, a novel online mapping-based approach for robust moving object detection in complex environments. The central idea of our approach is to incrementally estimate high confidence free-space areas by modeling and accounting for sensing, state estimation, and mapping limitations during online robot operation. The spatio-temporally conservative free space estimate enables robust detection of moving objects without making any assumptions on the appearance of objects or environments. This allows deployment in complex scenes such as multi-storied buildings or staircases, and for diverse moving objects such as people carrying various items, doors swinging or even balls rolling around. We thoroughly evaluate our approach on real-world data sets, achieving 86% IoU at 17 FPS in typical robotic settings. The method outperforms a recent appearance-based classifier and approaches the performance of offline methods. We demonstrate its generality on a novel data set with rare moving objects in complex environments. We make our efficient implementation and the novel data set available as open-source.

Maximize the Long-term Average Revenue of Network Slice Provider via Admission Control Among Heterogeneous Slices

  • Authors: Miao Dai, Gang Sun, Hongfang Yu, Dusit Niyato
  • Subjects: Networking and Internet Architecture (cs.NI)
  • Arxiv link: https://arxiv.org/abs/2304.10057
  • Pdf link: https://arxiv.org/pdf/2304.10057
  • Abstract
    Network slicing endows 5G/B5G with differentiated and customized capabilities to cope with the proliferation of diversified services, whereas limited physical network resources may not be able to support all service requests. Slice admission control is regarded as an essential means to ensure service quality and service isolation when the network is under burden. Herein, the scenario where rational tenants coexist with partially competitive network slice providers is adopted. We aim to maximize the long-term average revenue of the network operators through slice admission control, with the feasibility of multidimensional resource requirements, the priority differences among heterogeneous slices, and the admission fairness within each slice taken into account concurrently. We prove the intractability of our problem by a reduction from the Multidimensional Knapsack Problem (MKP), and propose a two-stage algorithm called MPSAC to obtain a sub-optimal solution efficiently. The principle of MPSAC is to split the original problem into two sub-problems: inter-slice decision-making and intra-slice quota allocation, which are solved using a heuristic method and a tailored auction mechanism, respectively. Extensive simulations are carried out to demonstrate the efficacy of our algorithm; the results show that our long-term average revenue is at least 9.6% higher than that of the comparison methods while maintaining better priority relations and achieving improved fairness performance.

High-Performance and Flexible Parallel Algorithms for Semisort and Related Problems

  • Authors: Xiaojun Dong, Yunshu Wu, Zhongqi Wang, Laxman Dhulipala, Yan Gu, Yihan Sun
  • Subjects: Data Structures and Algorithms (cs.DS)
  • Arxiv link: https://arxiv.org/abs/2304.10078
  • Pdf link: https://arxiv.org/pdf/2304.10078
  • Abstract
    Semisort is a fundamental algorithmic primitive widely used in the design and analysis of efficient parallel algorithms. It takes as input an array of records and a function extracting a \emph{key} per record, and reorders them so that records with equal keys are contiguous. Since many applications only require collecting equal values, but not fully sorting the input, semisort is broadly applicable, e.g., in string algorithms, graph analytics, and geometry processing, among many other domains. However, despite dozens of recent papers that use semisort in their theoretical analysis and the existence of an asymptotically optimal parallel semisort algorithm, most implementations of these parallel algorithms choose to implement semisort by using comparison or integer sorting in practice, due to potential performance issues in existing semisort implementations. In this paper, we revisit the semisort problem, with the goal of achieving a high-performance parallel semisort implementation with a flexible interface. Our approach can easily extend to two related problems, \emph{histogram} and \emph{collect-reduce}. Our algorithms achieve strong speedups in practice, and importantly, outperform state-of-the-art parallel sorting and semisorting methods for almost all settings we tested, with varying input sizes, distributions, and key types. We also test two important applications with real-world data, and show that our algorithms improve the performance over existing approaches. We believe that many other parallel algorithm implementations can be accelerated using our results.
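
For intuition about the primitive itself (not the paper's parallel algorithm), a sequential semisort only needs to make equal keys contiguous, which hashing achieves without imposing a total order:

```python
from collections import defaultdict

def semisort(records, key):
    """Reorder records so equal keys are contiguous (no total order needed)."""
    buckets = defaultdict(list)
    for r in records:
        buckets[key(r)].append(r)
    return [r for group in buckets.values() for r in group]

# Example: group edges by source vertex.
edges = [(2, 5), (0, 1), (2, 3), (0, 7)]
print(semisort(edges, key=lambda e: e[0]))  # [(2, 5), (2, 3), (0, 1), (0, 7)]
```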

Transmit Power Minimization for STAR-RIS Empowered Symbiotic Radio Communications

  • Authors: Chao Zhou, Bin Lyu, Youhong Feng, Dinh Thai Hoang
  • Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
  • Arxiv link: https://arxiv.org/abs/2304.10095
  • Pdf link: https://arxiv.org/pdf/2304.10095
  • Abstract
    In this paper, we propose a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) empowered transmission scheme for symbiotic radio (SR) systems to provide more flexibility in network deployment and enhance system performance. The STAR-RIS is utilized not only to beam the primary signals from the base station (BS) towards multiple primary users on the same side of the STAR-RIS, but also to achieve the secondary transmission to the secondary users on the other side. We consider both the broadcasting signal model and the unicasting signal model at the BS. For each model, we aim to minimize the transmit power of the BS by designing the active beamforming and the simultaneous reflection and transmission coefficients under the practical phase correlation constraint. To address the challenge of solving the formulated problem, we propose a block coordinate descent based algorithm with semidefinite relaxation, penalty dual decomposition and successive convex approximation methods, which decomposes the original problem into one sub-problem for the active beamforming and another for the simultaneous reflection and transmission coefficients, and iteratively solves them until convergence is achieved. Numerical results indicate that the proposed scheme can reduce transmit power by up to 150.6% compared to the backscattering device enabled scheme.

Two-Memory Reinforcement Learning

  • Authors: Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10098
  • Pdf link: https://arxiv.org/pdf/2304.10098
  • Abstract
    While deep reinforcement learning has shown important empirical success, it tends to learn relatively slowly due to the slow propagation of reward information and the slow update of parametric neural networks. Non-parametric episodic memory, on the other hand, provides a faster learning alternative that does not require representation learning and uses the maximum episodic return as state-action values for action selection. Episodic memory and reinforcement learning both have their own strengths and weaknesses. Notably, humans can leverage multiple memory systems concurrently during learning and benefit from all of them. In this work, we propose a method called the Two-Memory reinforcement learning agent (2M) that combines episodic memory and reinforcement learning to distill both of their strengths. The 2M agent exploits the speed of the episodic memory part and the optimality and generalization capacity of the reinforcement learning part to complement each other. Our experiments demonstrate that the 2M agent is more data efficient and outperforms both pure episodic memory and pure reinforcement learning, as well as a state-of-the-art memory-augmented RL agent. Moreover, the proposed approach provides a general framework that can be used to combine any episodic memory agent with other off-policy reinforcement learning algorithms.
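
The abstract does not spell out the combination rule, so the following is only one plausible toy sketch of mixing an episodic-memory greedy policy with a parametric Q-function; the `eps_mem` mixing probability is a hypothetical knob, not the paper's mechanism:

```python
import random

class TwoMemoryAgent:
    """Toy sketch: episodic memory stores the best return seen for each
    (state, action); action selection consults it alongside a Q-function."""

    def __init__(self, n_actions, q_func, eps_mem=0.5):
        self.n_actions = n_actions
        self.q = q_func                  # parametric RL estimate
        self.mem = {}                    # (state, action) -> best return
        self.eps_mem = eps_mem           # hypothetical mixing probability

    def update_memory(self, state, action, episodic_return):
        k = (state, action)
        self.mem[k] = max(self.mem.get(k, float("-inf")), episodic_return)

    def act(self, state):
        use_memory = random.random() < self.eps_mem and any(
            (state, a) in self.mem for a in range(self.n_actions))
        if use_memory:  # fast, non-parametric estimate
            return max(range(self.n_actions),
                       key=lambda a: self.mem.get((state, a), float("-inf")))
        return max(range(self.n_actions), key=lambda a: self.q(state, a))
```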

Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One

  • Authors: Hongyuan Zhang, Yanan Zhu, Xuelong Li
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10126
  • Pdf link: https://arxiv.org/pdf/2304.10126
  • Abstract
    Graph neural networks (GNNs) suffer from severe inefficiency. This is mainly caused by the exponential growth of node dependency with the increase of layers. It severely limits the application of stochastic optimization algorithms, so the training of GNNs is usually time-consuming. To address this problem, we propose to decouple a multi-layer GNN into multiple simple modules for more efficient training, which comprises classical forward training (FT) and a designed backward training (BT). Under the proposed framework, each module can be trained efficiently in FT by stochastic algorithms without distortion of graph information owing to its simplicity. To avoid the purely unidirectional information delivery of FT and sufficiently train shallow modules with the deeper ones, we develop a backward training mechanism that makes the former modules perceive the latter modules. The backward training introduces the reversed information delivery into the decoupled modules in addition to the forward information delivery. To investigate how the decoupling and greedy training affect the representational capacity, we theoretically prove that the error produced by linear modules will not accumulate on unsupervised tasks in most cases. The theoretical and experimental results show that the proposed framework is highly efficient with reasonable performance.

Securing Semantic Communications with Physical-layer Semantic Encryption and Obfuscation

  • Authors: Qi Qin, Yankai Rong, Guoshun Nan, Shaokang Wu, Xuefei Zhang, Qimei Cui, Xiaofeng Tao
  • Subjects: Cryptography and Security (cs.CR)
  • Arxiv link: https://arxiv.org/abs/2304.10147
  • Pdf link: https://arxiv.org/pdf/2304.10147
  • Abstract
    Deep learning based semantic communication (DLSC) systems have shown great potential for making wireless networks significantly more efficient by transmitting only the semantics of the data. However, the open nature of the wireless channel and the fragility of neural models make DLSC systems extremely vulnerable to various attacks. Traditional wireless physical layer key (PLK) approaches, which rely on the reciprocal channel and randomness characteristics between two legitimate users, hold the promise of securing DLSC. The main challenge lies in generating secret keys in static environments with ultra-low/zero rate. Different from prior efforts that use relays or reconfigurable intelligent surfaces (RIS) to manipulate wireless channels, this paper proposes a novel physical layer semantic encryption scheme by exploring the randomness of bilingual evaluation understudy (BLEU) scores in the field of machine translation, and additionally presents a novel semantic obfuscation mechanism to provide further physical layer protections. Specifically, 1) we calculate the BLEU scores and corresponding weights of the DLSC system. Then, we generate semantic keys (SKey) by feeding the weighted sum of the scores into a hash function. 2) Equipped with the SKey, our proposed subcarrier obfuscation is able to further secure semantic communications with a dynamic dummy data insertion mechanism. Experiments show the effectiveness of our method, especially in the static wireless environment.

Is ChatGPT a Good Recommender? A Preliminary Study

  • Authors: Junling Liu, Chao Liu, Renjie Lv, Kang Zhou, Yan Zhang
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/2304.10149
  • Pdf link: https://arxiv.org/pdf/2304.10149
  • Abstract
    Recommendation systems have witnessed significant advancements and have been widely used over the past decades. However, most traditional recommendation methods are task-specific and therefore lack efficient generalization ability. Recently, the emergence of ChatGPT has significantly advanced NLP tasks by enhancing the capabilities of conversational models. Nonetheless, the application of ChatGPT in the recommendation domain has not been thoroughly investigated. In this paper, we employ ChatGPT as a general-purpose recommendation model to explore its potential for transferring extensive linguistic and world knowledge acquired from large-scale corpora to recommendation scenarios. Specifically, we design a set of prompts and evaluate ChatGPT's performance on five recommendation scenarios. Unlike traditional recommendation methods, we do not fine-tune ChatGPT during the entire evaluation process, relying only on the prompts themselves to convert recommendation tasks into natural language tasks. Further, we explore the use of few-shot prompting to inject interaction information that contains user potential interest to help ChatGPT better understand user needs and interests. Comprehensive experimental results on the Amazon Beauty dataset show that ChatGPT has achieved promising results in certain tasks and is capable of reaching the baseline level in others. We conduct human evaluations on two explainability-oriented tasks to more accurately evaluate the quality of contents generated by different models. The human evaluations show that ChatGPT can truly understand the provided information and generate clearer and more reasonable results. We hope that our study can inspire researchers to further explore the potential of language models like ChatGPT to improve recommendation performance and contribute to the advancement of the recommendation systems field.

Automated Dynamic Bayesian Networks for Predicting Acute Kidney Injury Before Onset

  • Authors: David Gordon, Panayiotis Petousis, Anders O. Garlid, Keith Norris, Katherine Tuttle, Susanne B. Nicholas, Alex A.T. Bui (on behalf of CURE-CKD)
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10175
  • Pdf link: https://arxiv.org/pdf/2304.10175
  • Abstract
    Several algorithms for learning the structure of dynamic Bayesian networks (DBNs) require an a priori ordering of variables, which influences the determined graph topology. However, it is often unclear how to determine this order if feature importance is unknown, especially as an exhaustive search is usually impractical. In this paper, we introduce Ranking Approaches for Unknown Structures (RAUS), an automated framework to systematically inform variable ordering and learn networks end-to-end. RAUS leverages existing statistical methods (Cramer's V, chi-squared test, and information gain) to compare variable orderings, resultant generated network topologies, and DBN performance. RAUS enables end-users with limited DBN expertise to implement models via a command line interface. We evaluate RAUS on the task of predicting impending acute kidney injury (AKI) from inpatient clinical laboratory data. Longitudinal observations from 67,460 patients were collected from our electronic health record (EHR), and Kidney Disease Improving Global Outcomes (KDIGO) criteria were then applied to define AKI events. RAUS learns multiple DBNs simultaneously to predict a future AKI event at different time points (i.e., 24, 48, and 72 hours in advance of AKI). We also compared the results of the learned AKI prediction models and variable orderings to baseline techniques (logistic regression, random forests, and extreme gradient boosting). The DBNs generated by RAUS achieved 73-83% area under the receiver operating characteristic curve (AUCROC) within 24 hours before AKI, and 71-79% AUCROC within 48 hours before AKI of any stage in a 7-day observation window. Insights from this automated framework can help efficiently implement and interpret DBNs for clinical decision support. The source code for RAUS is available on GitHub at https://github.com/dgrdn08/RAUS.
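
The ranking statistics RAUS builds on are standard; a minimal numpy sketch of ordering candidate variables by Cramer's V against the outcome (the full framework also uses chi-squared and information gain and then learns the DBNs):

```python
import numpy as np

def cramers_v(x, y):
    """Cramer's V between two discrete variables (arrays of labels)."""
    x, y = np.asarray(x), np.asarray(y)
    xs, ys = np.unique(x), np.unique(y)
    table = np.array([[np.sum((x == a) & (y == b)) for b in ys] for a in xs],
                     dtype=float)
    n = table.sum()
    expected = np.outer(table.sum(1), table.sum(0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return np.sqrt(chi2 / (n * (min(table.shape) - 1)))

def rank_variables(data, outcome):
    """Order candidate variables (dict of name -> column) by association."""
    scores = {name: cramers_v(col, outcome) for name, col in data.items()}
    return sorted(scores, key=scores.get, reverse=True)
```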

Robust Deep Reinforcement Learning Scheduling via Weight Anchoring

  • Authors: Steffen Gracla, Edgar Beck, Carsten Bockelmann, Armin Dekorsy
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.10176
  • Pdf link: https://arxiv.org/pdf/2304.10176
  • Abstract
    Questions remain on the robustness of data-driven learning methods when crossing the gap from simulation to reality. We utilize weight anchoring, a method known from continual learning, to cultivate and fixate desired behavior in neural networks. Weight anchoring may be used to find a solution to a learning problem that is near the solution of another learning problem. Thereby, learning can be carried out in optimal environments without neglecting or unlearning desired behavior. We demonstrate this approach using the example of learning mixed QoS-efficient discrete resource scheduling with infrequent priority messages. Results show that this method provides performance comparable to the state of the art of augmenting a simulation environment, alongside significantly increased robustness and steerability.

Regularizing Second-Order Influences for Continual Learning

  • Authors: Zhicheng Sun, Yadong Mu, Gang Hua
  • Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10177
  • Pdf link: https://arxiv.org/pdf/2304.10177
  • Abstract
    Continual learning aims to learn on non-stationary data streams without catastrophically forgetting previous knowledge. Prevalent replay-based methods address this challenge by rehearsing on a small buffer holding the seen data, for which a delicate sample selection strategy is required. However, existing selection schemes typically seek only to maximize the utility of the ongoing selection, overlooking the interference between successive rounds of selection. Motivated by this, we dissect the interaction of sequential selection steps within a framework built on influence functions. We manage to identify a new class of second-order influences that will gradually amplify incidental bias in the replay buffer and compromise the selection process. To regularize the second-order effects, a novel selection objective is proposed, which also has clear connections to two widely adopted criteria. Furthermore, we present an efficient implementation for optimizing the proposed criterion. Experiments on multiple continual learning benchmarks demonstrate the advantage of our approach over state-of-the-art methods. Code is available at https://github.com/feifeiobama/InfluenceCL.

Efficient Uncertainty Estimation in Spiking Neural Networks via MC-dropout

  • Authors: Tao Sun, Bojian Yin, Sander Bohte
  • Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10191
  • Pdf link: https://arxiv.org/pdf/2304.10191
  • Abstract
    Spiking neural networks (SNNs) have gained attention as models of sparse and event-driven communication of biological neurons, and as such have shown increasing promise for energy-efficient applications in neuromorphic hardware. As with classical artificial neural networks (ANNs), predictive uncertainties are important for decision making in high-stakes applications, such as autonomous vehicles, medical diagnosis, and high frequency trading. Yet, discussion of uncertainty estimation in SNNs is limited, and approaches for uncertainty estimation in ANNs are not directly applicable to SNNs. Here, we propose an efficient Monte Carlo (MC) dropout based approach for uncertainty estimation in SNNs. Our approach exploits the time-step mechanism of SNNs to enable MC-dropout in a computationally efficient manner, without introducing significant overheads during training and inference, while demonstrating high accuracy and uncertainty quality.
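
MC-dropout itself is a standard recipe: keep dropout stochastic at inference and aggregate repeated forward passes. A PyTorch sketch for an ordinary classifier follows; the SNN-specific time-step sharing the paper proposes is not shown:

```python
import torch

def mc_dropout_predict(model, x, n_samples=20):
    """Mean prediction and predictive uncertainty via MC-dropout."""
    # train() keeps dropout active at inference; a careful implementation
    # would enable only the dropout modules, since train() also affects
    # layers such as batch norm.
    model.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1)
                             for _ in range(n_samples)])
    return probs.mean(0), probs.std(0)  # class probabilities, uncertainty
```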

Selective and Collaborative Influence Function for Efficient Recommendation Unlearning

  • Authors: Yuyuan Li, Chaochao Chen, Xiaolin Zheng, Yizhao Zhang, Biao Gong, Jun Wang
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/2304.10199
  • Pdf link: https://arxiv.org/pdf/2304.10199
  • Abstract
    Recent regulations on the Right to be Forgotten have greatly influenced the way of running a recommender system, because users now have the right to withdraw their private data. Besides simply deleting the target data in the database, unlearning the associated data lineage, e.g., the learned personal features and preferences in the model, is also necessary for data withdrawal. Existing unlearning methods are mainly devised for generalized machine learning models in classification tasks. In this paper, we first identify two main disadvantages of directly applying existing unlearning methods in the context of recommendation, i.e., (i) unsatisfactory efficiency for large-scale recommendation models and (ii) destruction of collaboration across users and items. To tackle the above issues, we propose an extra-efficient recommendation unlearning method based on Selective and Collaborative Influence Function (SCIF). Our proposed method can (i) avoid any kind of retraining, which is computationally prohibitive for large-scale systems, (ii) further enhance efficiency by selectively updating user embeddings and (iii) preserve the collaboration across the remaining users and items. Furthermore, in order to evaluate the unlearning completeness, we define a Membership Inference Oracle (MIO), which can determine whether the unlearned data points were in the training set of the model, i.e., whether a data point was completely unlearned. Extensive experiments on two benchmark datasets demonstrate that our proposed method can not only greatly enhance unlearning efficiency, but also achieve adequate unlearning completeness. More importantly, our proposed method outperforms the state-of-the-art unlearning method regarding comprehensive recommendation metrics.

Spiking-Fer: Spiking Neural Network for Facial Expression Recognition With Event Cameras

  • Authors: Sami Barchid, Benjamin Allaert, Amel Aissaoui, José Mennesson, Chaabane Djéraba
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10211
  • Pdf link: https://arxiv.org/pdf/2304.10211
  • Abstract
    Facial Expression Recognition (FER) is an active research domain that has shown great progress recently, notably thanks to the use of large deep learning models. However, such approaches are particularly energy intensive, which makes their deployment difficult for edge devices. To address this issue, Spiking Neural Networks (SNNs) coupled with event cameras are a promising alternative, capable of processing sparse and asynchronous events with lower energy consumption. In this paper, we establish the first use of event cameras for FER, named "Event-based FER", and propose the first related benchmarks by converting popular video FER datasets to event streams. To deal with this new task, we propose "Spiking-FER", a deep convolutional SNN model, and compare it against a similar Artificial Neural Network (ANN). Experiments show that the proposed approach achieves comparable performance to the ANN architecture, while consuming less energy by orders of magnitude (up to 65.39x). In addition, an experimental study of various event-based data augmentation techniques is performed to provide insights into the efficient transformations specific to event-based FER.

Dynamic Security Region of Natural Gas Systems in Integrated Electricity-Gas Systems

  • Authors: Han Gao, Peiyao Zhao, Zhengshuo Li
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.10215
  • Pdf link: https://arxiv.org/pdf/2304.10215
  • Abstract
    In an integrated electricity-gas system (IEGS), the tight coupling of power and natural gas systems is embodied by frequent changes in gas withdrawal from gas-fired units to provide regulation services for the power system to handle uncertainty, which may in turn endanger the secure operation of the natural gas system and ultimately affect the safety of the whole IEGS. Hence, it is necessary to accurately and efficiently evaluate the dynamic security region (DSR) of the natural gas system in the IEGS by considering the real-time dynamic characteristics of natural gas systems, which are not satisfactorily handled in state-of-the-art works. To bridge this gap, this paper first conceptually verifies the necessity of the DSR and establishes its mathematical model. Then, a dimensionality reduction method is proposed for the efficient solution and visualization of the high-dimensional DSR evaluation model. A fast evaluation (FE) algorithm is developed to address the difficulties of the nonconvex dynamic constraints in the reduced DSR model. Finally, the necessity and notable advantages of the proposed DSR model and FE are verified based on small and relatively large test systems in comparison with common security region models and algorithms. To the best of our knowledge, this is the first paper that comprehensively presents models and efficient algorithms regarding the DSR of natural gas systems in an IEGS.

An Analysis of the Completion Time of the BB84 Protocol

  • Authors: Sounak Kar, Jean-Yves Le Boudec
  • Subjects: Performance (cs.PF); Quantum Physics (quant-ph)
  • Arxiv link: https://arxiv.org/abs/2304.10218
  • Pdf link: https://arxiv.org/pdf/2304.10218
  • Abstract
    The BB84 QKD protocol is based on the idea that the sender and the receiver can reconcile a certain fraction of the teleported qubits to detect eavesdropping or noise and decode the rest to use as a private key. Under the present hardware infrastructure, decoherence of quantum states poses a significant challenge to performing perfect or efficient teleportation, meaning that a teleportation-based protocol must be run multiple times to observe success. Thus, performance analyses of such protocols usually consider the completion time, i.e., the time until success, rather than the duration of a single attempt. Moreover, due to decoherence, the success of an attempt is in general dependent on the duration of individual phases of that attempt, as quantum states must wait in memory while the success or failure of a generation phase is communicated to the relevant parties. In this work, we analyze the completion time of the BB84 protocol in a setting where the sender and the receiver are connected via a single quantum repeater and the only quantum channel between them does not see any adversarial attack. Assuming certain distributional forms for the generation and communication phases of teleportation, we provide a method to compute the MGF of the completion time and subsequently derive an estimate of the CDF and a bound on the tail probability. This result helps us gauge the (tail) behaviour of the completion time in terms of the parameters characterising the elementary phases of teleportation, without having to run the protocol multiple times. We also provide an efficient simulation scheme to generate the completion time, which relies on expressing the completion time in terms of aggregated teleportation times. We numerically compare our approach with a full-scale simulation and observe good agreement between them.
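
The flavor of the analysis can be mimicked with a short Monte Carlo over attempts; all distributional choices below are illustrative placeholders, not the paper's model:

```python
import math
import random

def completion_time(p0=0.8, coherence=5.0, rate_gen=1.0, rate_comm=2.0):
    """Simulate time until the first successful attempt.

    Each attempt draws a generation phase and a communication phase
    (exponential purely for illustration); the success probability
    decays with the time the state waits in memory while the herald
    travels to the parties.
    """
    t = 0.0
    while True:
        gen = random.expovariate(rate_gen)
        comm = random.expovariate(rate_comm)
        t += gen + comm
        if random.random() < p0 * math.exp(-comm / coherence):
            return t

samples = [completion_time() for _ in range(10_000)]
print(sum(samples) / len(samples))  # empirical mean completion time
```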

PED-ANOVA: Efficiently Quantifying Hyperparameter Importance in Arbitrary Subspaces

  • Authors: Shuhei Watanabe, Archit Bansal, Frank Hutter
  • Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
  • Arxiv link: https://arxiv.org/abs/2304.10255
  • Pdf link: https://arxiv.org/pdf/2304.10255
  • Abstract
    The recent rise in popularity of Hyperparameter Optimization (HPO) for deep learning has highlighted the role that good hyperparameter (HP) space design can play in training strong models. In turn, designing a good HP space is critically dependent on understanding the role of different HPs. This motivates research on HP Importance (HPI), e.g., with the popular method of functional ANOVA (f-ANOVA). However, the original f-ANOVA formulation is inapplicable to the subspaces most relevant to algorithm designers, such as those defined by top performance. To overcome this problem, we derive a novel formulation of f-ANOVA for arbitrary subspaces and propose an algorithm that uses Pearson divergence (PED) to enable a closed-form computation of HPI. We demonstrate that this new algorithm, dubbed PED-ANOVA, is able to successfully identify important HPs in different subspaces while also being extremely computationally efficient.

PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image

  • Authors: Jianhui Li, Jianmin Li, Haoji Zhang, Shilong Liu, Zhengyi Wang, Zihao Xiao, Kaiwen Zheng, Jun Zhu
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10263
  • Pdf link: https://arxiv.org/pdf/2304.10263
  • Abstract
    We study the 3D-aware image attribute editing problem in this paper, which has wide applications in practice. Recent methods solved the problem by training a shared encoder to map images into a 3D generator's latent space or by per-image latent code optimization and then edited images in the latent space. Despite their promising results near the input view, they still suffer from the 3D inconsistency of produced images at large camera poses and imprecise image attribute editing, such as affecting unspecified attributes during editing. For more efficient image inversion, we train a shared encoder for all images. To alleviate 3D inconsistency at large camera poses, we propose two novel methods, an alternating training scheme and a multi-view identity loss, to maintain 3D consistency and subject identity. As for imprecise image editing, we attribute the problem to the gap between the latent space of real images and that of generated images. We compare the latent space and inversion manifold of GAN models and demonstrate that editing in the inversion manifold can achieve better results in both quantitative and qualitative evaluations. Extensive experiments show that our method produces more 3D consistent images and achieves more precise image editing than previous work. Source code and pretrained models can be found on our project page: https://mybabyyh.github.io/Preim3D/

Robust nonlinear set-point control with reinforcement learning

  • Authors: Ruoqi Zhang, Per Mattsson, Torbjörn Wigren
  • Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10277
  • Pdf link: https://arxiv.org/pdf/2304.10277
  • Abstract
    There has recently been an increased interest in reinforcement learning for nonlinear control problems. However, standard reinforcement learning algorithms can often struggle even on seemingly simple set-point control problems. This paper argues that three ideas can improve reinforcement learning methods even for highly nonlinear set-point control problems: 1) Make use of a prior feedback controller to aid amplitude exploration. 2) Use integrated errors. 3) Train on model ensembles. Together these ideas lead to more efficient training, and a trained set-point controller that is more robust to modelling errors and thus can be directly deployed to real-world nonlinear systems. The claim is supported by experiments with a real-world nonlinear cascaded tank process and a simulated strongly nonlinear pH-control system.

A baseline on continual learning methods for video action recognition

  • Authors: Giulia Castagnolo, Concetto Spampinato, Francesco Rundo, Daniela Giordano, Simone Palazzo
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10335
  • Pdf link: https://arxiv.org/pdf/2304.10335
  • Abstract
    Continual learning has recently attracted attention from the research community, as it aims to solve long-standing limitations of classic supervised models. However, most research on this subject has tackled continual learning in simple image classification scenarios. In this paper, we present a benchmark of state-of-the-art continual learning methods on video action recognition. Besides the increased complexity due to the temporal dimension, the video setting imposes stronger requirements on computing resources for top-performing rehearsal methods. To counteract the increased memory requirements, we present two method-agnostic variants for rehearsal methods, exploiting measures of either model confidence or data information to select memorable samples. Our experiments show that, as expected from the literature, rehearsal methods outperform other approaches; moreover, the proposed memory-efficient variants are shown to be effective at retaining a certain level of performance with a smaller buffer size.

Engel's theorem in Mathlib

  • Authors: Oliver Nash
  • Subjects: Logic in Computer Science (cs.LO); Representation Theory (math.RT)
  • Arxiv link: https://arxiv.org/abs/2304.10424
  • Pdf link: https://arxiv.org/pdf/2304.10424
  • Abstract
    We discuss the theory of Lie algebras in Lean's Mathlib library. Using nilpotency as the theme, we outline a computer formalisation of Engel's theorem and an application to root space theory. We emphasise that all arguments work with coefficients in any commutative ring.

GPT-NER: Named Entity Recognition via Large Language Models

  • Authors: Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Fei Wu, Tianwei Zhang, Jiwei Li, Guoyin Wang
  • Subjects: Computation and Language (cs.CL)
  • Arxiv link: https://arxiv.org/abs/2304.10428
  • Pdf link: https://arxiv.org/pdf/2304.10428
  • Abstract
    Despite the fact that large-scale Language Models (LLMs) have achieved SOTA performance on a variety of NLP tasks, their performance on NER is still significantly below supervised baselines. This is due to the gap between the two tasks: NER is by nature a sequence labeling task, while LLMs are text-generation models. In this paper, we propose GPT-NER to resolve this issue. GPT-NER bridges the gap by transforming the sequence labeling task into a generation task that can be easily adapted by LLMs, e.g., the task of finding location entities in the input text "Columbus is a city" is transformed to generate the text sequence "@@columbus## is a city", where the special tokens @@ and ## mark the entity to extract. To efficiently address the "hallucination" issue of LLMs, where LLMs have a strong inclination to over-confidently label NULL inputs as entities, we propose a self-verification strategy by prompting LLMs to ask themselves whether the extracted entities belong to a labeled entity tag. We conduct experiments on five widely adopted NER datasets, and GPT-NER achieves comparable performance to fully supervised baselines, which is, to the best of our knowledge, a first. More importantly, we find that GPT-NER exhibits a greater ability in low-resource and few-shot setups: when the amount of training data is extremely scarce, GPT-NER performs significantly better than supervised models. This demonstrates the capability of GPT-NER in real-world NER applications where the number of labeled examples is limited.
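
The @@/## convention makes decoding straightforward; a sketch of recovering entity spans from a GPT-NER style output, assuming the model reproduces the input verbatim apart from the markers:

```python
import re

def extract_entities(generated: str):
    """Pull out spans wrapped as @@entity## from a GPT-NER style output."""
    return re.findall(r"@@(.+?)##", generated)

print(extract_entities("@@columbus## is a city"))  # ['columbus']
```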

Securing Neural Networks with Knapsack Optimization

  • Authors: Yakir Gorski, Shai Avidan
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10442
  • Pdf link: https://arxiv.org/pdf/2304.10442
  • Abstract
    Deep learning inference brings together the data and the Convolutional Neural Network (CNN). This is problematic when the user wants to preserve the privacy of the data and the service provider does not want to reveal the weights of its CNN. Secure Inference allows the two parties to engage in a protocol that preserves their respective privacy concerns, while revealing only the inference result to the user. This is known as Multi-Party Computation (MPC). A major bottleneck of MPC algorithms is communication, as the parties must send data back and forth. The linear component of a CNN (i.e., convolutions) can be done efficiently with minimal communication, but the non-linear part (i.e., ReLU) requires the bulk of communication bandwidth. We propose two ways to accelerate Secure Inference. The first is based on the observation that the ReLU outcome of many convolutions is highly correlated. Therefore, we replace the per-pixel ReLU operation by a ReLU operation per patch. Each layer in the network will benefit from a patch of a different size, and we devise an algorithm to choose the optimal set of patch sizes through a novel reduction of the problem to a knapsack problem. The second way to accelerate Secure Inference is based on cutting the number of bit comparisons required for a secure ReLU operation. We demonstrate the cumulative effect of these tools in the semi-honest secure 3-party setting for four problems: classifying ImageNet using a ResNet50 backbone, classifying CIFAR100 using a ResNet18 backbone, semantic segmentation of ADE20K using a MobileNetV2 backbone and semantic segmentation of Pascal VOC 2012 using a ResNet50 backbone. Our source code is publicly available: $\href{https://github.com/yg320/secure_inference}{\text{https://github.com/yg320/secure_inference}}$
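
Choosing one patch size per layer under a global ReLU budget is a multiple-choice knapsack; a schematic DP where the per-option costs and accuracy proxies are hypothetical inputs (the paper derives them from the network), and each layer is assumed to have at least one affordable option:

```python
def choose_patch_sizes(options, budget):
    """Multiple-choice knapsack: pick one (cost, value) option per layer.

    options[layer] is a list of (relu_cost, accuracy_proxy) pairs, e.g.
    one pair per candidate patch size; budget caps the total ReLU cost.
    Returns (best_total_value, chosen_option_index_per_layer).
    """
    NEG = float("-inf")
    best = [NEG] * (budget + 1)
    best[0] = 0.0
    choice = [[None] * (budget + 1) for _ in options]
    for li, layer_opts in enumerate(options):
        new = [NEG] * (budget + 1)
        for c in range(budget + 1):
            if best[c] == NEG:
                continue
            for oi, (cost, value) in enumerate(layer_opts):
                if c + cost <= budget and best[c] + value > new[c + cost]:
                    new[c + cost] = best[c] + value
                    choice[li][c + cost] = (oi, c)  # option, predecessor cost
        best = new
    # Backtrack from the best reachable total cost.
    c = max(range(budget + 1), key=lambda k: best[k])
    picks = [None] * len(options)
    for li in reversed(range(len(options))):
        oi, c = choice[li][c]
        picks[li] = oi
    return max(best), picks

# Example: two layers, each with a cheap and an expensive patch option.
print(choose_patch_sizes([[(1, 0.7), (3, 0.9)], [(1, 0.6), (2, 0.8)]], 4))
```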

Angle based dynamic learning rate for gradient descent

  • Authors: Neel Mishra, Pawan Kumar
  • Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10457
  • Pdf link: https://arxiv.org/pdf/2304.10457
  • Abstract
    In our work, we propose a novel yet simple approach to obtain an adaptive learning rate for gradient-based descent methods on classification tasks. Instead of the traditional approach of selecting adaptive learning rates via the decayed expectation of gradient-based terms, we use the angle between the current gradient and the new gradient: this new gradient is computed from the direction orthogonal to the current gradient, which further helps us in determining a better adaptive learning rate based on angle history, thereby leading to better accuracy compared to existing state-of-the-art optimizers. On a wide variety of benchmark datasets with prominent image classification architectures such as ResNet, DenseNet, EfficientNet, and VGG, we find that our method leads to the highest accuracy in most of the datasets. Moreover, we prove that our method is convergent.
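
A loose sketch of the general idea, scaling the step size by the angle between successive gradients; this is a generic illustration, not the authors' exact update rule (which probes a gradient along the orthogonal direction):

```python
import numpy as np

def angle_adaptive_sgd(grad_fn, w, lr=0.1, steps=100):
    """Gradient descent whose step size is scaled by the cosine of the
    angle between consecutive gradients (generic illustration only)."""
    prev_g = None
    for _ in range(steps):
        g = grad_fn(w)
        if prev_g is not None:
            denom = np.linalg.norm(g) * np.linalg.norm(prev_g) + 1e-12
            cos = np.dot(g, prev_g) / denom
            # Aligned gradients (cos ~ 1) keep a large step; oscillating
            # gradients (cos ~ -1) shrink it.
            scale = 0.5 * (1.0 + cos)
        else:
            scale = 1.0
        w = w - lr * scale * g
        prev_g = g
    return w

# Example on a quadratic: minimize ||w||^2.
print(angle_adaptive_sgd(lambda w: 2 * w, np.array([3.0, -2.0])))
```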

Reducing Aggregate Electric Vehicle Battery Capacity through Sharing

  • Authors: Polina Alexeenko, Vasileios Charisopoulos
  • Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
  • Arxiv link: https://arxiv.org/abs/2304.10461
  • Pdf link: https://arxiv.org/pdf/2304.10461
  • Abstract
    Meeting growing demand for automotive battery resources is predicted to be costly from both economic and environmental perspectives. To minimize these costs, battery resources should be deployed as efficiently as possible. A potential source of inefficiency in battery deployment is the fact that the batteries of personal vehicles are typically much larger than needed to meet most daily mobility needs. In this paper, we consider whether battery resources can be used more efficiently in a setting where drivers, in addition to having personal vehicle batteries, have access to a shared battery resource. More precisely, we consider the problem of minimizing aggregate battery capacity in settings with and without a shared resource, subject to the requirement that driver commuting needs are met with high reliability. To assess the potential for reductions in deployed battery capacity with the addition of a shared resource, we quantify the difference in deployed battery capacity with and without a shared resource in a case study using real-world longitudinal mobility data from Puget Sound, Washington. We find that giving drivers access to a shared battery resource can substantially reduce deployed battery capacity. Furthermore, relative reductions in battery capacity increase with the number of drivers and the level of reliability desired.

Efficient Deep Reinforcement Learning Requires Regulating Overfitting

  • Authors: Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
  • Arxiv link: https://arxiv.org/abs/2304.10466
  • Pdf link: https://arxiv.org/pdf/2304.10466
  • Abstract
    Deep reinforcement learning algorithms that learn policies by trial-and-error must learn from limited amounts of data collected by actively interacting with the environment. While many prior works have shown that proper regularization techniques are crucial for enabling data-efficient RL, a general understanding of the bottlenecks in data-efficient RL has remained unclear. Consequently, it has been difficult to devise a universal technique that works well across all domains. In this paper, we attempt to understand the primary bottleneck in sample-efficient deep RL by examining several potential hypotheses such as non-stationarity, excessive action distribution shift, and overfitting. We perform thorough empirical analysis on state-based DeepMind control suite (DMC) tasks in a controlled and systematic way to show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms, and that prior methods which lead to good performance do, in fact, keep the validation TD error low. This observation gives us a robust principle for making deep RL efficient: we can hill-climb on the validation TD error using any form of regularization technique from supervised learning. We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
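
The paper's model-selection principle is easy to state in code: score each candidate (e.g., a regularization setting) by its TD error on held-out transitions and keep the lowest. The sketch below uses random tabular Q-functions and synthetic transitions purely to show the metric; it is not the authors' implementation.

```python
import numpy as np

def validation_td_error(q, transitions, gamma=0.99):
    """Mean squared TD error of a tabular Q-function on held-out transitions."""
    errs = []
    for s, a, r, s_next, done in transitions:
        target = r + (0.0 if done else gamma * q[s_next].max())
        errs.append((q[s, a] - target) ** 2)
    return float(np.mean(errs))

rng = np.random.default_rng(0)
held_out = [(rng.integers(5), rng.integers(2), rng.normal(), rng.integers(5), False)
            for _ in range(100)]                      # synthetic validation set
candidates = {reg: rng.normal(scale=1.0 / (1 + reg), size=(5, 2))
              for reg in (0.0, 0.1, 1.0)}             # stand-ins for trained Q's
best = min(candidates, key=lambda r: validation_td_error(candidates[r], held_out))
print("selected regularization:", best)
```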

A primal dual mixed finite element method for inverse identification of the diffusion coefficient and its relation to the Kohn-Vogelius penalty method

  • Authors: Erik Burman
  • Subjects: Numerical Analysis (math.NA)
  • Arxiv link: https://arxiv.org/abs/2304.10467
  • Pdf link: https://arxiv.org/pdf/2304.10467
  • Abstract
    We revisit the celebrated Kohn-Vogelius penalty method and discuss how to use it for the unique continuation problem where data is given in the bulk of the domain. We then show that the primal-dual mixed finite element methods for the elliptic Cauchy problem introduced in \cite{BLO18} (*E. Burman, M. Larson, L. Oksanen, Primal-dual mixed finite element methods for the elliptic Cauchy problem, SIAM J. Num. Anal., 56 (6), 2018*) can be interpreted as a Kohn-Vogelius penalty method, and we modify them to allow for unique continuation using data in the bulk. We prove that the resulting linear system is invertible for all data. Then we show that, by introducing a singularly perturbed Robin condition on the discrete level, sufficient regularization is obtained so that error estimates can be shown using conditional stability. Finally, we show how the method can be used for the identification of the diffusivity coefficient in a second-order elliptic operator with partial data. Some numerical examples are presented, showing the performance of the method for unique continuation and for impedance computed tomography with partial data.

New Closed-Form ASER Expressions for Dual-Hop Mixed THz-RF Cooperative Relay Networks

  • Authors: Soumendu Das, Nagendra Kumar, Dharmendra Dixit
  • Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
  • Arxiv link: https://arxiv.org/abs/2304.10504
  • Pdf link: https://arxiv.org/pdf/2304.10504
  • Abstract
    In this paper, we consider a dual-hop mixed THz-RF system model for backhaul-fronthaul applications where the link between source and destination is established only through the relay node, in which a decode-and-forward relaying protocol is used. The THz link suffers from the joint impact of antenna misalignment and the stochastic characteristics of wireless channels, including the effect of environmental conditions such as pressure, humidity, and temperature. The envelope of the THz link in the first hop follows a generalized $\alpha-\mu$ distribution, and for the RF end, the Nakagami-$m$ distribution is considered. In this context, we obtain new closed-form expressions for the cumulative distribution function and the moment-generating function of the end-to-end signal-to-noise ratio. Further, we derive average symbol error rate expressions for coherent rectangular quadrature amplitude modulation (RQAM) and coherent hexagonal QAM (HQAM), as well as for a non-coherent modulation scheme. The asymptotic behavior is also discussed to examine the system's diversity. Furthermore, the impact of several parameters, such as the fading coefficients of the individual links, antenna misalignment, and the distance between nodes, is also highlighted in the system's performance. Moreover, Monte Carlo simulations are used to validate the presented analytical framework. Finally, the presented numerical insights aid in the extraction of practical design principles.

Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget

  • Authors: Johannes Lehner, Benedikt Alkin, Andreas Fürst, Elisabeth Rumetshofer, Lukas Miklautz, Sepp Hochreiter
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10520
  • Pdf link: https://arxiv.org/pdf/2304.10520
  • Abstract
    Masked Image Modeling (MIM) methods, like Masked Autoencoders (MAE), efficiently learn a rich representation of the input. However, for adapting to downstream tasks, they require a sufficient amount of labeled data since their rich features capture not only objects but also less relevant image background. In contrast, Instance Discrimination (ID) methods focus on objects. In this work, we study how to combine the efficiency and scalability of MIM with the ability of ID to perform downstream classification in the absence of large amounts of labeled data. To this end, we introduce Masked Autoencoder Contrastive Tuning (MAE-CT), a sequential approach that applies Nearest Neighbor Contrastive Learning (NNCLR) to a pre-trained MAE. MAE-CT tunes the rich features such that they form semantic clusters of objects without using any labels. Applied to large and huge Vision Transformer (ViT) models, MAE-CT matches or surpasses previous self-supervised methods trained on ImageNet in linear probing, k-NN, and low-shot classification accuracy, as well as in unsupervised clustering accuracy. Notably, similar results can be achieved without additional image augmentations. While ID methods generally rely on hand-crafted augmentations to avoid shortcut learning, we find that nearest neighbor lookup is sufficient and that this data-driven augmentation effect improves with model size. MAE-CT is compute efficient. For instance, starting from a MAE pre-trained ViT-L/16, MAE-CT increases ImageNet 1% low-shot accuracy from 67.7% to 72.6%, linear probing accuracy from 76.0% to 80.2%, and k-NN accuracy from 60.6% to 79.1% in just five hours using eight A100 GPUs.

Learning Narrow One-Hidden-Layer ReLU Networks

  • Authors: Sitan Chen, Zehao Dou, Surbhi Goel, Adam R Klivans, Raghu Meka
  • Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
  • Arxiv link: https://arxiv.org/abs/2304.10524
  • Pdf link: https://arxiv.org/pdf/2304.10524
  • Abstract
    We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions. We give the first polynomial-time algorithm that succeeds whenever $k$ is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.

Learning Sparse and Low-Rank Priors for Image Recovery via Iterative Reweighted Least Squares Minimization

  • Authors: Stamatios Lefkimmiatis, Iaroslav Koshelev
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
  • Arxiv link: https://arxiv.org/abs/2304.10536
  • Pdf link: https://arxiv.org/pdf/2304.10536
  • Abstract
    We introduce a novel optimization algorithm for image recovery under learned sparse and low-rank constraints, which we parameterize as weighted extensions of the $\ell_p^p$-vector and $\mathcal S_p^p$ Schatten-matrix quasi-norms for $0 < p \le 1$, respectively. Our proposed algorithm generalizes the Iteratively Reweighted Least Squares (IRLS) method, used for signal recovery under $\ell_1$ and nuclear-norm constrained minimization. Further, we interpret our overall minimization approach as a recurrent network that we then employ to deal with inverse low-level computer vision problems. Thanks to the convergence guarantees that our IRLS strategy offers, we are able to train the derived reconstruction networks using a memory-efficient implicit back-propagation scheme, which does not pose any restrictions on their effective depth. To assess our networks' performance, we compare them against other existing reconstruction methods on several inverse problems, namely image deblurring, super-resolution, demosaicking and sparse recovery. Our reconstruction results are shown to be very competitive and in many cases outperform those of existing unrolled networks, whose number of parameters is orders of magnitude higher than that of our learned models.
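
For readers unfamiliar with IRLS, the sketch below shows the classical scheme the paper generalizes: an $\ell_p^p$ penalty is handled by repeatedly solving a weighted ridge problem with weights $(x_i^2+\epsilon)^{p/2-1}$. This is a plain NumPy illustration of the reweighting idea, not the authors' learned, memory-efficient network; recovery quality depends on the smoothing schedule.

```python
import numpy as np

def irls_lp(A, b, p=0.5, lam=1e-3, eps=1e-4, iters=100):
    x = A.T @ np.linalg.solve(A @ A.T, b)        # least-norm initialization
    for _ in range(iters):
        w = (x**2 + eps) ** (p / 2 - 1)          # smoothed l_p reweighting
        x = np.linalg.solve(A.T @ A + lam * np.diag(w), A.T @ b)
        eps = max(eps * 0.9, 1e-10)              # tighten the smoothing gradually
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))                   # underdetermined system
x_true = np.zeros(100)
x_true[[3, 30, 77]] = [1.0, -2.0, 0.5]
x_hat = irls_lp(A, A @ x_true)
print(np.round(x_hat[[3, 30, 77]], 2))           # sparse entries approximately recovered
```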

Learning Neural Duplex Radiance Fields for Real-Time View Synthesis

  • Authors: Ziyu Wan, Christian Richardt, Aljaž Božič, Chao Li, Vijay Rengarajan, Seonghyeon Nam, Xiaoyu Xiang, Tuotuo Li, Bo Zhu, Rakesh Ranjan, Jing Liao
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
  • Arxiv link: https://arxiv.org/abs/2304.10537
  • Pdf link: https://arxiv.org/pdf/2304.10537
  • Abstract
    Neural radiance fields (NeRFs) enable novel view synthesis with unprecedented visual quality. However, to render photorealistic images, NeRFs require hundreds of deep multilayer perceptron (MLP) evaluations per pixel. This is prohibitively expensive and makes real-time rendering infeasible, even on powerful modern GPUs. In this paper, we propose a novel approach to distill and bake NeRFs into highly efficient mesh-based neural representations that are fully compatible with the massively parallel graphics rendering pipeline. We represent scenes as neural radiance features encoded on a two-layer duplex mesh, which effectively overcomes the inherent inaccuracies in 3D surface reconstruction by learning the aggregated radiance information from a reliable interval of ray-surface intersections. To exploit local geometric relationships of nearby pixels, we leverage screen-space convolutions instead of the MLPs used in NeRFs to achieve high-quality appearance. Finally, the performance of the whole framework is further boosted by a novel multi-view distillation optimization strategy. We demonstrate the effectiveness and superiority of our approach via extensive experiments on a range of standard datasets.

Keyword: faster

An Intent-based Framework for Vehicular Edge Computing

  • Authors: TianZhang He, Adel N. Toosi, Negin Akbari, Muhammed Tawfiqul Islam, Muhammad Aamir Cheema
  • Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC)
  • Arxiv link: https://arxiv.org/abs/2304.09916
  • Pdf link: https://arxiv.org/pdf/2304.09916
  • Abstract
    The rapid development of emerging vehicular edge computing (VEC) brings new opportunities and challenges for dynamic resource management. The increasing number of edge data centers, roadside units (RSUs), and network devices, however, makes resource management a complex task in VEC. On the other hand, the exponential growth of service applications and end-users makes corresponding QoS hard to maintain. Intent-Based Networking (IBN), based on Software-Defined Networking, was introduced to provide the ability to automatically handle and manage the networking requirements of different applications. Motivated by the IBN concept, in this paper, we propose a novel approach to jointly orchestrate networking and computing resources based on user requirements. The proposed solution constantly monitors user requirements and dynamically re-configures the system to satisfy desired states of the application. We compared our proposed solution with the state-of-the-art networking embedding algorithms using real-world taxi GPS traces. Results show that our proposed method is significantly faster (up to 95%) and can improve resource utilization (up to 76%) and the acceptance ratio of computing and networking requests with various priorities (up to 71%). We also present a small-scale prototype of the proposed intent management framework to validate our solution.

Speed Me up if You Can: Conditional Lower Bounds on Opacity Verification

  • Authors: Jiří Balun, Tomáš Masopust, Petr Osička
  • Subjects: Formal Languages and Automata Theory (cs.FL); Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.09920
  • Pdf link: https://arxiv.org/pdf/2304.09920
  • Abstract
    Opacity is a property of privacy and security applications asking whether, given a system model, a passive intruder that makes online observations of system's behaviour can ascertain some "secret" information of the system. Deciding opacity is a PSpace-complete problem, and hence there are no polynomial-time algorithms to verify opacity under the assumption that PSpace differs from PTime. This assumption, however, gives rise to a question whether the existing exponential-time algorithms are the best possible or whether there are faster, sub-exponential-time algorithms. We show that under the (Strong) Exponential Time Hypothesis, there are no algorithms that would be significantly faster than the existing algorithms. As a by-product, we obtained a new conditional lower bound on the time complexity of deciding universality (and therefore also inclusion and equivalence) for nondeterministic finite automata.

Two-Memory Reinforcement Learning

  • Authors: Zhao Yang, Thomas. M. Moerland, Mike Preuss, Aske Plaat
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10098
  • Pdf link: https://arxiv.org/pdf/2304.10098
  • Abstract
    While deep reinforcement learning has shown important empirical success, it tends to learn relatively slowly due to the slow propagation of reward information and the slow update of parametric neural networks. Non-parametric episodic memory, on the other hand, provides a faster learning alternative that does not require representation learning and uses maximum episodic returns as state-action values for action selection. Episodic memory and reinforcement learning both have their own strengths and weaknesses. Notably, humans can leverage multiple memory systems concurrently during learning and benefit from all of them. In this work, we propose a method called Two-Memory reinforcement learning agent (2M) that combines episodic memory and reinforcement learning to distill both of their strengths. The 2M agent exploits the speed of the episodic memory part and the optimality and generalization capacity of the reinforcement learning part, letting the two complement each other. Our experiments demonstrate that the 2M agent is more data efficient and outperforms both pure episodic memory and pure reinforcement learning, as well as a state-of-the-art memory-augmented RL agent. Moreover, the proposed approach provides a general framework that can be used to combine any episodic memory agent with other off-policy reinforcement learning algorithms.
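
A minimal sketch of the two-memory idea, with a deliberately simplified mixing rule: keep the best episodic return ever observed per state-action, and blend it with the parametric Q-estimate at action-selection time, falling back to Q alone where memory is empty. The blending coefficient and fallback rule are illustrative assumptions, not the 2M agent's exact mechanism.

```python
from collections import defaultdict
import numpy as np

class TwoMemoryPolicy:
    def __init__(self, n_actions, beta=0.5):
        # episodic memory: best return ever observed per (state, action)
        self.mem = defaultdict(lambda: np.full(n_actions, -np.inf))
        self.beta = beta

    def update_memory(self, state, action, episode_return):
        self.mem[state][action] = max(self.mem[state][action], episode_return)

    def act(self, state, q_values):
        m = self.mem[state]
        combined = np.where(np.isfinite(m),
                            self.beta * m + (1 - self.beta) * q_values,
                            q_values)   # fall back to Q where memory is empty
        return int(np.argmax(combined))

pi = TwoMemoryPolicy(n_actions=3)
pi.update_memory("s0", 1, episode_return=10.0)
print(pi.act("s0", q_values=np.array([0.2, 0.1, 0.3])))  # memory favors action 1
```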

ZEBRA: Z-order Curve-based Event Retrieval Approach to Efficiently Explore Automotive Data

  • Authors: Christian Berger, Lukas Birkemeyer
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/2304.10232
  • Pdf link: https://arxiv.org/pdf/2304.10232
  • Abstract
    Evaluating the performance of software for automated vehicles is predominantly driven by data collected from the real world. While professional test drivers are supported with technical means to semi-automatically annotate driving maneuvers to allow better event identification, simple data loggers in large vehicle fleets typically lack automatic and detailed event classification, and hence extra effort is needed when post-processing such data. Yet, the data quality from professional test drivers is apparently higher than that from large fleets where labels are missing, even though the non-annotated data set from large vehicle fleets is much more representative of the typical, realistic driving scenarios to be handled by automated vehicles. However, while growing the data from large fleets is relatively simple, adding valuable annotations during post-processing has become increasingly expensive. In this paper, we leverage Z-order space-filling curves to systematically reduce data dimensionality while preserving domain-specific data properties, which allows us to explore even large-scale field data sets to spot interesting events orders of magnitude faster than processing time-series data directly. Furthermore, the proposed concept is based on an analytical approach, which preserves explainability for the identified events.
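
The core trick is compact enough to show directly: a Z-order (Morton) code interleaves the bits of two quantized signal dimensions, so multi-dimensional samples map to a single dimension while nearby values tend to stay close. A minimal sketch follows, with the choice of dimensions (e.g., quantized speed and steering angle) as an assumed example:

```python
def morton2(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` of x and y into one Z-order code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # even bit positions <- x
        code |= ((y >> i) & 1) << (2 * i + 1)    # odd bit positions  <- y
    return code

print(morton2(3, 5))   # 39 == 0b100111
print(morton2(3, 4))   # 37 == 0b100101: nearby inputs, nearby codes
```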

Observer-Feedback-Feedforward Controller Structures in Reinforcement Learning

  • Authors: Ruoqi Zhang, Per Mattson, Torbjörn Wigren
  • Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10276
  • Pdf link: https://arxiv.org/pdf/2304.10276
  • Abstract
    The paper proposes the use of structured neural networks for reinforcement learning based nonlinear adaptive control. The focus is on partially observable systems, with separate neural networks for the state and feedforward observer and the state feedback and feedforward controller. The observer dynamics are modelled by recurrent neural networks while a standard network is used for the controller. As discussed in the paper, this leads to a separation of the observer dynamics to the recurrent neural network part, and the state feedback to the feedback and feedforward network. The structured approach reduces the computational complexity and gives the reinforcement learning based controller an *understandable* structure as compared to when one single neural network is used. As shown by simulation the proposed structure has the additional and main advantage that the training becomes significantly faster. Two ways to include feedforward structure are presented, one related to state feedback control and one related to classical feedforward control. The latter method introduces further structure with a separate recurrent neural network that processes only the measured disturbance. When evaluated with simulation on a nonlinear cascaded double tank process, the method with most structure performs the best, with excellent feedforward disturbance rejection gains.

Regret-Minimizing Double Oracle for Extensive-Form Games

  • Authors: Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang
  • Subjects: Computer Science and Game Theory (cs.GT)
  • Arxiv link: https://arxiv.org/abs/2304.10498
  • Pdf link: https://arxiv.org/pdf/2304.10498
  • Abstract
    By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form games and extensive-form games, through algorithms such as online double oracle (ODO) and extensive-form double oracle (XDO), respectively. In this study, we further examine the theoretical convergence rate and sample complexity of such regret minimization-based double oracle methods, utilizing a unified framework called Regret-Minimizing Double Oracle. Based on this framework, we extend ODO to extensive-form games and determine its sample complexity. Moreover, we demonstrate that the sample complexity of XDO can be exponential in the number of information sets $|S|$, owing to the exponentially decaying stopping threshold of restricted games. To solve this problem, we propose the Periodic Double Oracle (PDO) method, which has the lowest sample complexity among all existing double oracle methods, being only polynomial in $|S|$. Empirical evaluations on multiple poker and board games show that PDO achieves significantly faster convergence than previous double oracle algorithms and reaches a competitive level with state-of-the-art regret minimization methods.

Transformer Models for Type Inference in the Simply Typed Lambda Calculus: A Case Study in Deep Learning for Code

  • Authors: Brando Miranda, Avi Shinnar, Vasily Pestun, Barry Trager
  • Subjects: Programming Languages (cs.PL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Logic in Computer Science (cs.LO); Symbolic Computation (cs.SC)
  • Arxiv link: https://arxiv.org/abs/2304.10500
  • Pdf link: https://arxiv.org/pdf/2304.10500
  • Abstract
    Despite a growing body of work at the intersection of deep learning and formal languages, there has been relatively little systematic exploration of transformer models for reasoning about typed lambda calculi. This is an interesting area of inquiry for two reasons. First, typed lambda calculi are the lingua franca of programming languages. A set of heuristics that relate various typed lambda calculi to effective neural architectures would provide a systematic method for mapping language features (e.g., polymorphism, subtyping, inheritance, etc.) to architecture choices. Second, transformer models are widely used in deep learning architectures applied to code, but the design and hyperparameter space for them is large and relatively unexplored in programming language applications. Therefore, we suggest a benchmark that allows us to explore exactly this through perhaps the simplest and most fundamental property of a programming language: the relationship between terms and types. Consequently, we begin this inquiry into transformer architectures for typed lambda calculi by exploring the effect of transformer warm-up and optimizer selection on the task of type inference: i.e., predicting the types of lambda calculus terms using only transformers. We find that the optimization landscape is difficult even in this simple setting. One particular experimental finding is that optimization with Adafactor converges much faster than optimization with Adam or RAdam. We conjecture that these performance differences between optimizers might be related to the difficulty of generalizing over a formally generated dataset.

Autonomic Architecture for Big Data Performance Optimization

  • Authors: Mikhail Genkin, Frank Dehne, Anousheh Shahmirza, Pablo Navarro, Siyu Zhou
  • Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10503
  • Pdf link: https://arxiv.org/pdf/2304.10503
  • Abstract
    The big data software stack based on Apache Spark and Hadoop has become mission critical in many enterprises. Performance of Spark and Hadoop jobs depends on a large number of configuration settings. Manual tuning is expensive and brittle. There have been prior efforts to develop on-line and off-line automatic tuning approaches to make the big data stack less dependent on manual tuning. These, however, demonstrated only modest performance improvements with very simple, single-user workloads on small data sets. This paper presents KERMIT - the autonomic architecture for big data capable of automatically tuning Apache Spark and Hadoop on-line, and achieving performance results 30% faster than rule-of-thumb tuning by a human administrator and up to 92% as fast as the fastest possible tuning established by performing an exhaustive search of the tuning parameter space. KERMIT can detect important workload changes with up to 99% accuracy, and predict future workload types with up to 96% accuracy. It is capable of identifying and classifying complex multi-user workloads without being explicitly trained on examples of these workloads. It does not rely on the past workload history to predict the future workload classes and their associated performance. KERMIT can identify and learn new workload classes, and adapt to workload drift, without human intervention.

Generalizing Neural Human Fitting to Unseen Poses With Articulated SE(3) Equivariance

  • Authors: Haiwen Feng, Peter Kulits, Shichen Liu, Michael J. Black, Victoria Abrevaya
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10528
  • Pdf link: https://arxiv.org/pdf/2304.10528
  • Abstract
    We address the problem of fitting a parametric human body model (SMPL) to point cloud data. Optimization-based methods require careful initialization and are prone to becoming trapped in local optima. Learning-based methods address this but do not generalize well when the input pose is far from those seen during training. For rigid point clouds, remarkable generalization has been achieved by leveraging SE(3)-equivariant networks, but these methods do not work on articulated objects. In this work we extend this idea to human bodies and propose ArtEq, a novel part-based SE(3)-equivariant neural architecture for SMPL model estimation from point clouds. Specifically, we learn a part detection network by leveraging local SO(3) invariance, and regress shape and pose using articulated SE(3) shape-invariant and pose-equivariant networks, all trained end-to-end. Our novel equivariant pose regression module leverages the permutation-equivariant property of self-attention layers to preserve rotational equivariance. Experimental results show that ArtEq can generalize to poses not seen during training, outperforming state-of-the-art methods by 74.5%, without requiring an optimization refinement step. Further, compared with competing works, our method is more than three orders of magnitude faster during inference and has 97.3% fewer parameters. The code and model will be available for research purposes at https://arteq.is.tue.mpg.de.

Keyword: mobile

NRTS: A Client-Server architecture for supporting data recording, transmission and evaluation of multidisciplinary teams during the neonatal resuscitation simulation scenario

  • Authors: Manuel Striani
  • Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI)
  • Arxiv link: https://arxiv.org/abs/2304.09860
  • Pdf link: https://arxiv.org/pdf/2304.09860
  • Abstract
    In this technical report, we describe the Neonatal Resuscitation Training Simulator (NRTS), an Android mobile app designed to support medical experts in inputting, transmitting and recording data during a High-Fidelity Simulation course for neonatal resuscitation. This mobile app automatically sends all the recorded data from the Neonatal Intensive Care Unit (NICU) of Casale Monferrato Children's Hospital (Italy) to a server located at the Department of Science and Technological Innovation (DiSIT), University of Piemonte Orientale (Italy). Finally, the medical instructor can view statistics on a simulation exercise that may be used during the de-briefing phase for the evaluation of the multidisciplinary teams involved in the simulation scenarios.

Scheduling DNNs on Edge Servers

  • Authors: Jian He, Chenxi Yang, Zhaoyuan He, Ghufran Baig, Lili Qiu
  • Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.09961
  • Pdf link: https://arxiv.org/pdf/2304.09961
  • Abstract
    Deep neural networks (DNNs) have been widely used in various video analytic tasks. These tasks demand real-time responses. Due to the limited processing power on mobile devices, a common way to support such real-time analytics is to offload the processing to an edge server. This paper examines how to speed up the edge server DNN processing for multiple clients. In particular, we observe batching multiple DNN requests significantly speeds up the processing time. Based on this observation, we first design a novel scheduling algorithm to exploit the batching benefits of all requests that run the same DNN. This is compelling since there are only a handful of DNNs and many requests tend to use the same DNN. Our algorithms are general and can support different objectives, such as minimizing the completion time or maximizing the on-time ratio. We then extend our algorithm to handle requests that use different DNNs with or without shared layers. Finally, we develop a collaborative approach to further improve performance by adaptively processing some of the requests or portions of the requests locally at the clients. This is especially useful when the network and/or server is congested. Our implementation shows the effectiveness of our approach under different request distributions (e.g., Poisson, Pareto, and Constant inter-arrivals).
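
To illustrate why batching helps and how a scheduler might exploit it, here is a toy sketch with an assumed cost model (fixed per-batch overhead plus a small per-request increment) and a shortest-batch-first ordering; the paper's actual algorithms and objectives are more general.

```python
from collections import defaultdict

def batch_time(n, overhead=10.0, per_req=1.5):
    return overhead + per_req * n                 # assumed batched-inference cost

requests = ["resnet50", "yolo", "resnet50", "resnet50", "yolo"]
groups = defaultdict(int)
for dnn in requests:                              # group requests by DNN
    groups[dnn] += 1

# Shortest-processing-time-first over per-DNN batches reduces mean completion time.
t, completion = 0.0, {}
for dnn, n in sorted(groups.items(), key=lambda kv: batch_time(kv[1])):
    t += batch_time(n)
    completion[dnn] = t
print(completion)        # {'yolo': 13.0, 'resnet50': 27.5}
# Unbatched, the same five requests would keep the server busy 5 * 11.5 = 57.5.
```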

Availability Model of a 5G-MEC System

  • Authors: Thilina Pathirana, Gianfranco Nencioni
  • Subjects: Networking and Internet Architecture (cs.NI)
  • Arxiv link: https://arxiv.org/abs/2304.09992
  • Pdf link: https://arxiv.org/pdf/2304.09992
  • Abstract
    Multi-access Edge Computing (MEC) is one of the enabling technologies of the fifth generation (5G) of mobile networks. MEC enables services with strict latency requirements by bringing computing capabilities close to the users. As with any new technology, the dependability of MEC is one of the aspects that need to be carefully studied. In this paper, we propose a two-level model to compute the availability of a 5G-MEC system. We then use the model to evaluate the availability of a 5G-MEC system under various configurations. The results show that having a single redundancy of the 5G-MEC elements leads to acceptable availability. To reach high availability, the software failure intensity of the management elements of 5G and MEC should be reduced.
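
As a back-of-the-envelope companion to the availability model, the sketch below shows the standard building blocks such two-level models compose: parallel redundancy ($1-(1-a)^n$) and serial chains (product of availabilities). The element availabilities and the three-element chain are illustrative assumptions, not the paper's parameters.

```python
def parallel(a: float, n: int) -> float:
    return 1.0 - (1.0 - a) ** n          # n redundant replicas of one element

def series(*avail: float) -> float:
    out = 1.0
    for a in avail:                      # chain fails if any element fails
        out *= a
    return out

a = 0.999                                # hypothetical per-element availability
no_redundancy = series(a, a, a)                      # e.g. radio -> MEC host -> mgmt
single_redundancy = series(*[parallel(a, 2)] * 3)    # each element duplicated once
print(f"{no_redundancy:.6f} vs {single_redundancy:.9f}")
```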

Robust Route Planning with Distributional Reinforcement Learning in a Stochastic Road Network Environment

  • Authors: Xi Lin, Paul Szenher, John D. Martin, Brendan Englot
  • Subjects: Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/2304.09996
  • Pdf link: https://arxiv.org/pdf/2304.09996
  • Abstract
    Route planning is essential to mobile robot navigation problems. In recent years, deep reinforcement learning (DRL) has been applied to learning optimal planning policies in stochastic environments without prior knowledge. However, existing works focus on learning policies that maximize the expected return, the performance of which can vary greatly when the level of stochasticity in the environment is high. In this work, we propose a distributional reinforcement learning based framework that learns return distributions which explicitly reflect environmental stochasticity. Policies based on the second-order stochastic dominance (SSD) relation can be used to make adjustable route decisions according to user preference on performance robustness. Our proposed method is evaluated in a simulated road network environment, and experimental results show that our method is able to plan the shortest routes that minimize stochasticity in travel time when robustness is preferred, while other state-of-the-art DRL methods are agnostic to environmental stochasticity.
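
For empirical return distributions with equally many samples, the SSD check used for route selection reduces to comparing partial sums of sorted outcomes: route A dominates route B if every partial sum of A's sorted returns is at least B's, meaning every risk-averse decision maker prefers A. A minimal sketch with made-up return samples:

```python
import numpy as np

def ssd_dominates(a, b):
    """Second-order stochastic dominance for equal-size empirical samples."""
    ca, cb = np.cumsum(np.sort(a)), np.cumsum(np.sort(b))
    return bool(np.all(ca >= cb))

route_a = np.array([9.0, 10.0, 11.0, 10.0])   # same mean return, low spread
route_b = np.array([5.0, 15.0, 4.0, 16.0])    # same mean return, high spread
print(ssd_dominates(route_a, route_b))        # True: prefer the steadier route
print(ssd_dominates(route_b, route_a))        # False
```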

FTMRate: Collision-Immune Distance-based Data Rate Selection for IEEE 802.11 Networks

  • Authors: Wojciech Ciezobka, Maksymilian Wojnar, Katarzyna Kosek-Szott, Szymon Szott, Krzysztof Rusek
  • Subjects: Networking and Internet Architecture (cs.NI)
  • Arxiv link: https://arxiv.org/abs/2304.10140
  • Pdf link: https://arxiv.org/pdf/2304.10140
  • Abstract
    Data rate selection algorithms for Wi-Fi devices are an important area of research because they directly impact performance. Most of the proposals are based on measuring the transmission success probability for a given data rate. In dense scenarios, however, this probing approach will fail because frame collisions are misinterpreted as erroneous data rate selection. We propose FTMRate which uses the fine timing measurement (FTM) feature, recently introduced in IEEE 802.11. FTM allows stations to measure their distance from the AP. We argue that knowledge of the distance from the receiver can be useful in determining which data rate to use. We apply statistical learning (a form of machine learning) to estimate the distance based on measurements, estimate channel quality from the distance, and select data rates based on channel quality. We evaluate three distinct estimation approaches: exponential smoothing, Kalman filter, and particle filter. We present a performance evaluation of the three variants of FTMRate and show, in several dense and mobile (though line-of-sight only) scenarios, that it can outperform two benchmarks and provide close to optimal results in IEEE 802.11ax networks.
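
A minimal sketch of the exponential-smoothing variant of the idea: smooth noisy FTM distance readings, map distance to an SNR estimate with a log-distance path-loss model, and pick the highest rate whose SNR threshold is met. The path-loss constants and MCS thresholds here are illustrative assumptions, not the values used by FTMRate.

```python
import math

def smooth(prev, meas, alpha=0.3):                 # exponential smoothing
    return alpha * meas + (1 - alpha) * prev

def snr_from_distance(d, tx_dbm=20.0, pl0_db=40.0, exponent=3.0, noise_dbm=-94.0):
    path_loss = pl0_db + 10 * exponent * math.log10(max(d, 1.0))
    return tx_dbm - path_loss - noise_dbm          # link budget in dB

MCS_THRESHOLDS = [(2, 5.0), (4, 9.0), (6, 15.0), (8, 25.0)]  # (MCS, min SNR dB)

def select_mcs(snr):
    best = 0
    for mcs, thr in MCS_THRESHOLDS:
        if snr >= thr:
            best = mcs
    return best

d_est = 10.0
for meas in [12.0, 11.0, 30.0, 12.5]:              # includes one outlier reading
    d_est = smooth(d_est, meas)
    print(f"{d_est:5.1f} m -> MCS {select_mcs(snr_from_distance(d_est))}")
```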

A Large-scale Examination of "Socioeconomic" Fairness in Mobile Networks

  • Authors: Souneil Park, Pavol Mulinka, Diego Perino
  • Subjects: Computers and Society (cs.CY)
  • Arxiv link: https://arxiv.org/abs/2304.10190
  • Pdf link: https://arxiv.org/pdf/2304.10190
  • Abstract
    Internet access is a special resource: the need for it has become universal across the public, whereas the service is operated in the private sector. Mobile Network Operators (MNOs) put effort into management, planning, and optimization; however, they do not link such activities to socioeconomic fairness. In this paper, we make a first step towards understanding the relation between the socioeconomic status of customers and network performance, and investigate potential discrimination in network deployment and management. The scope of our study spans various aspects, including urban geography, network resource deployment, data consumption, and device distribution. A novel methodology that enables a geo-socioeconomic perspective on mobile networks is developed for the study. The results are based on an actual infrastructure in multiple cities, covering millions of users who densely span the socioeconomic scale. We report a thorough examination of the fairness status, its relationship with various structural factors, and potential class-specific solutions.

Breast cancer detection using deep learning

  • Authors: Gayathri Girish, Ponnathota Spandana, Badrish Vasu
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10386
  • Pdf link: https://arxiv.org/pdf/2304.10386
  • Abstract
    Objective: This paper proposes a deep learning model for breast cancer detection from reconstructed images of microwave imaging scan data and aims to improve the accuracy and efficiency of breast tumor detection, which could have a significant impact on breast cancer diagnosis and treatment. Methods: Our framework consists of different convolutional neural network (CNN) architectures for feature extraction and a region-based CNN for tumor detection. We use 7 different architectures: DenseNet201, ResNet50, InceptionV3, InceptionResNetV3, MobileNetV2, NASNetMobile and NASNetLarge, and compare their performance to find the best architecture out of the seven. An experimental dataset of MRI-derived breast phantoms was used. Results: NASNetLarge is the best-performing architecture, achieving an accuracy of 88.41% and a loss of 27.82%. Given that the model's AUC is 0.786, it can be concluded that it is suitable for use in its present form, while it could be improved upon and trained on other, comparable datasets. Impact: One of the main causes of death in women is breast cancer, and early identification is essential for improving patient outcomes. Due to its non-invasiveness and capacity to produce high-resolution images, microwave imaging is a promising tool for breast cancer screening. The complexity of tumors makes it difficult to adequately detect them in microwave images. The results of this research show that deep learning has considerable potential for breast cancer detection in microwave images.

Securing Neural Networks with Knapsack Optimization

  • Authors: Yakir Gorski, Shai Avidan
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10442
  • Pdf link: https://arxiv.org/pdf/2304.10442
  • Abstract
    Deep learning inference brings together the data and the Convolutional Neural Network (CNN). This is problematic when the user wants to preserve the privacy of the data and the service provider does not want to reveal the weights of his CNN. Secure Inference allows the two parties to engage in a protocol that preserves their respective privacy concerns, while revealing only the inference result to the user. This is known as Multi-Party Computation (MPC). A major bottleneck of MPC algorithms is communication, as the parties must send data back and forth. The linear component of a CNN (i.e., convolutions) can be computed efficiently with minimal communication, but the non-linear part (i.e., ReLU) requires the bulk of the communication bandwidth. We propose two ways to accelerate Secure Inference. The first is based on the observation that the ReLU outcomes of many convolutions are highly correlated. Therefore, we replace the per-pixel ReLU operation with a per-patch ReLU operation. Each layer in the network benefits from a patch of a different size, and we devise an algorithm to choose the optimal set of patch sizes through a novel reduction of the problem to a knapsack problem. The second way to accelerate Secure Inference is based on cutting the number of bit comparisons required for a secure ReLU operation. We demonstrate the cumulative effect of these tools in the semi-honest secure 3-party setting for four problems: classifying ImageNet using a ResNet50 backbone, classifying CIFAR100 using a ResNet18 backbone, semantic segmentation of ADE20K using a MobileNetV2 backbone, and semantic segmentation of Pascal VOC 2012 using a ResNet50 backbone. Our source code is publicly available: https://github.com/yg320/secure_inference

Keyword: pruning

Model Pruning Enables Localized and Efficient Federated Learning for Yield Forecasting and Data Sharing

  • Authors: Andy Li, Milan Markovic, Peter Edwards, Georgios Leontidis
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.09876
  • Pdf link: https://arxiv.org/pdf/2304.09876
  • Abstract
    Federated Learning (FL) presents a decentralized approach to model training in the agri-food sector and offers the potential for improved machine learning performance, while ensuring the safety and privacy of individual farms or data silos. However, the conventional FL approach has two major limitations. First, the heterogeneous data on individual silos can cause the global model to perform well for some clients but not all, as the update direction on some clients may hinder others after they are aggregated. Second, it is inefficient with respect to communication costs during FL and large model sizes. This paper proposes a new technical solution that utilizes network pruning on client models and aggregates the pruned models. This method enables local models to be tailored to their respective data distributions and mitigates the data heterogeneity present in agri-food data. Moreover, it allows for more compact models that consume less bandwidth during transmission. We experiment with a soybean yield forecasting dataset and find that this approach can improve inference performance by 15.5% to 20% compared to FedAvg, while reducing local model sizes by up to 84% and the data volume communicated between the clients and the server by 57.1% to 64.7%.
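
A minimal sketch of the client-side step, assuming plain unstructured magnitude pruning (the paper may use a different pruning criterion or schedule): zero out the smallest-magnitude fraction of a weight tensor so that only the surviving entries need to be communicated.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.8):
    """Zero out the smallest-magnitude `sparsity` fraction of w."""
    k = int(sparsity * w.size)
    thresh = np.partition(np.abs(w).ravel(), k)[k]   # k-th smallest magnitude
    mask = np.abs(w) >= thresh
    return w * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))                        # one client layer's weights
w_pruned, mask = magnitude_prune(w)
print(f"kept {mask.sum()}/{w.size} weights; "
      f"upload volume shrinks roughly in proportion")
```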

Keyword: voxel

Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra

  • Authors: Jonas Kulhanek, Torsten Sattler
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.09987
  • Pdf link: https://arxiv.org/pdf/2304.09987
  • Abstract
    Neural Radiance Fields (NeRFs) are a very recent and very popular approach for the problems of novel view synthesis and 3D reconstruction. A popular scene representation used by NeRFs is to combine a uniform, voxel-based subdivision of the scene with an MLP. Based on the observation that a (sparse) point cloud of the scene is often available, this paper proposes to use an adaptive representation based on tetrahedra and a Delaunay representation instead of the uniform subdivision or point-based representations. We show that such a representation enables efficient training and leads to state-of-the-art results. Our approach elegantly combines concepts from 3D geometry processing, triangle-based rendering, and modern neural radiance fields. Compared to voxel-based representations, ours provides more detail around parts of the scene likely to be close to the surface. Compared to point-based representations, our approach achieves better performance.

Multiscale Representation for Real-Time Anti-Aliasing Neural Rendering

  • Authors: Dongting Hu, Zhenkai Zhang, Tingbo Hou, Tongliang Liu, Huan Fu, Mingming Gong
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10075
  • Pdf link: https://arxiv.org/pdf/2304.10075
  • Abstract
    The rendering scheme in neural radiance field (NeRF) is effective in rendering a pixel by casting a ray into the scene. However, NeRF yields blurred rendering results when the training images are captured at non-uniform scales, and produces aliasing artifacts if the test images are taken in distant views. To address this issue, Mip-NeRF proposes a multiscale representation as a conical frustum to encode scale information. Nevertheless, this approach is only suitable for offline rendering since it relies on integrated positional encoding (IPE) to query a multilayer perceptron (MLP). To overcome this limitation, we propose mip voxel grids (Mip-VoG), an explicit multiscale representation with a deferred architecture for real-time anti-aliasing rendering. Our approach includes a density Mip-VoG for scene geometry and a feature Mip-VoG with a small MLP for view-dependent color. Mip-VoG encodes scene scale using the level of detail (LOD) derived from ray differentials and uses quadrilinear interpolation to map a queried 3D location to its features and density from two neighboring downsampled voxel grids. To our knowledge, our approach is the first to offer multiscale training and real-time anti-aliasing rendering simultaneously. We conducted experiments on multiscale datasets, and the results show that our approach outperforms state-of-the-art real-time rendering baselines.

Keyword: lidar

LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields

  • Authors: Tang Tao, Longfei Gao, Guangrun Wang, Peng Chen, Dayang Hao, Xiaodan Liang, Mathieu Salzmann, Kaicheng Yu
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10406
  • Pdf link: https://arxiv.org/pdf/2304.10406
  • Abstract
    We introduce a new task, novel view synthesis for LiDAR sensors. While traditional model-based LiDAR simulators with style-transfer neural networks can be applied to render novel views, they fall short in producing accurate and realistic LiDAR patterns, because the renderers they rely on exploit game engines, which are not differentiable. We address this by formulating, to the best of our knowledge, the first differentiable LiDAR renderer, and propose an end-to-end framework, LiDAR-NeRF, leveraging a neural radiance field (NeRF) to enable jointly learning the geometry and the attributes of 3D points. To evaluate the effectiveness of our approach, we establish an object-centric multi-view LiDAR dataset, dubbed NeRF-MVL. It contains observations of objects from 9 categories seen from 360-degree viewpoints captured with multiple LiDAR sensors. Our extensive experiments on the scene-level KITTI-360 dataset, and on our object-level NeRF-MVL, show that our LiDAR-NeRF surpasses the model-based algorithms significantly.

Keyword: diffusion

Using Text-to-Image Generation for Architectural Design Ideation

  • Authors: Ville Paananen, Jonas Oppenlaender, Aku Visuri
  • Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10182
  • Pdf link: https://arxiv.org/pdf/2304.10182
  • Abstract
    The recent progress of text-to-image generation has been recognized in architectural design. Our study is the first to investigate the potential of text-to-image generators in supporting creativity during the early stages of the architectural design process. We conducted a laboratory study with 17 architecture students, who developed a concept for a culture center using three popular text-to-image generators: Midjourney, Stable Diffusion, and DALL-E. Through standardized questionnaires and group interviews, we found that image generation could be a meaningful part of the design process when design constraints are carefully considered. Generative tools support serendipitous discovery of ideas and an imaginative mindset, enriching the design process. We identified several challenges of image generators and provided considerations for software development and educators to support creativity and emphasize designers' imaginative mindset. By understanding the limitations and potential of text-to-image generators, architects and designers can leverage this technology in their design process and education, facilitating innovation and effective communication of concepts.

A data augmentation perspective on diffusion models and retrieval

  • Authors: Max F. Burg, Florian Wenzel, Dominik Zietlow, Max Horn, Osama Makansi, Francesco Locatello, Chris Russell
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10253
  • Pdf link: https://arxiv.org/pdf/2304.10253
  • Abstract
    Diffusion models excel at generating photorealistic images from text queries. Naturally, many approaches have been proposed to use these generative abilities to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large, noisily supervised, but nonetheless annotated datasets. It is an open question whether the generalization capabilities of diffusion models, beyond using the additional data of the pre-training process for augmentation, lead to improved downstream performance. We perform a systematic evaluation of existing methods to generate images from diffusion models and study new extensions to assess their benefit for data augmentation. While we find that personalizing diffusion models towards the target data outperforms simpler prompting strategies, we also show that using the training data of the diffusion model alone, via a simple nearest neighbor retrieval procedure, leads to even stronger downstream performance. Overall, our study probes the limitations of diffusion models for data augmentation but also highlights its potential in generating new training data to improve performance on simple downstream vision tasks.

Anything-3D: Towards Single-view Anything Reconstruction in the Wild

  • Authors: Qiuhong Shen, Xingyi Yang, Xinchao Wang
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10261
  • Pdf link: https://arxiv.org/pdf/2304.10261
  • Abstract
    3D reconstruction from a single RGB image in unconstrained real-world scenarios presents numerous challenges due to the inherent diversity and complexity of objects and environments. In this paper, we introduce Anything-3D, a methodical framework that ingeniously combines a series of visual-language models and the Segment-Anything object segmentation model to elevate objects to 3D, yielding a reliable and versatile system for the single-view conditioned 3D reconstruction task. Our approach employs a BLIP model to generate textual descriptions, utilizes the Segment-Anything model for the effective extraction of objects of interest, and leverages a text-to-image diffusion model to lift the object into a neural radiance field. Demonstrating its ability to produce accurate and detailed 3D reconstructions for a wide array of objects, Anything-3D shows promise in addressing the limitations of existing methodologies. Through comprehensive experiments and evaluations on various datasets, we showcase the merits of our approach, underscoring its potential to contribute meaningfully to the field of 3D reconstruction. Demos and code will be available at https://github.com/Anything-of-anything/Anything-3D.

Prediction of the evolution of the nuclear reactor core parameters using artificial neural network

  • Authors: Krzysztof Palmi, Wojciech Kubinski, Piotr Darnowski
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10337
  • Pdf link: https://arxiv.org/pdf/2304.10337
  • Abstract
    A nuclear reactor based on the MIT BEAVRS benchmark was used as a typical power-generating Pressurized Water Reactor (PWR). The PARCS v3.2 nodal-diffusion core simulator was used as a full-core reactor physics solver to emulate the operation of a reactor and to generate training and validation data for the ANN. The ANN was implemented with dedicated Python 3.8 code using Google's TensorFlow 2.0 library. The effort was based to a large extent on the appropriate automatic transformation of the data generated by the PARCS simulator, which was later used in the development of the ANN. Various methods for obtaining better accuracy of the ANN predictions were studied, such as trying different ANN architectures to find the optimal number of neurons in the hidden layers of the network. Results were later compared with the architectures proposed in the literature. For the selected best architecture, predictions were made for different core parameters and their dependence on core loading patterns. In this study, a special focus was put on the prediction of the fuel cycle length for a given core loading pattern, as it can be considered one of the targets for economic plant operation. For instance, the length of a single fuel cycle depending on the initial core loading pattern was predicted with very good accuracy (>99%). This work contributes to the exploration of the usefulness of neural networks in solving nuclear reactor design problems. Thanks to the application of ANNs, designers can avoid using an excessive number of core simulator runs and more rapidly explore the space of possible solutions before performing more detailed design considerations.
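
As a flavor of the setup described (and since the abstract names TensorFlow 2), here is a minimal Keras sketch of a regression MLP mapping an encoded loading pattern to a predicted cycle length. The input encoding, layer sizes, and synthetic data are assumptions for illustration, not the architecture selected in the paper.

```python
import numpy as np
import tensorflow as tf

n_positions = 56                           # hypothetical: encoded assembly positions
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),              # predicted fuel cycle length
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

rng = np.random.default_rng(0)
X = rng.uniform(1.6, 4.95, size=(512, n_positions))  # stand-in for PARCS-derived data
y = X.mean(axis=1, keepdims=True) * 100.0            # toy target, not reactor physics
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.predict(X[:2], verbose=0))
```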

Collaborative Diffusion for Multi-Modal Face Generation and Editing

  • Authors: Ziqi Huang, Kelvin C.K. Chan, Yuming Jiang, Ziwei Liu
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10530
  • Pdf link: https://arxiv.org/pdf/2304.10530
  • Abstract
    Diffusion models have recently emerged as a powerful generative tool. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further unleash users' creativity, it is desirable for the model to be controllable by multiple modalities simultaneously, e.g., generating and editing faces by describing the age (text-driven) while drawing the face shape (mask-driven). In this work, we present Collaborative Diffusion, where pre-trained uni-modal diffusion models collaborate to achieve multi-modal face generation and editing without re-training. Our key insight is that diffusion models driven by different modalities are inherently complementary regarding the latent denoising steps, upon which bilateral connections can be established. Specifically, we propose the dynamic diffuser, a meta-network that adaptively hallucinates multi-modal denoising steps by predicting the spatial-temporal influence functions for each pre-trained uni-modal model. Collaborative Diffusion not only combines the generation capabilities of uni-modal diffusion models, but also integrates multiple uni-modal manipulations to perform multi-modal editing. Extensive qualitative and quantitative experiments demonstrate the superiority of our framework in both image quality and condition consistency.

Nerfbusters: Removing Ghostly Artifacts from Casually Captured NeRFs

  • Authors: Frederik Warburg, Ethan Weber, Matthew Tancik, Aleksander Holynski, Angjoo Kanazawa
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
  • Arxiv link: https://arxiv.org/abs/2304.10532
  • Pdf link: https://arxiv.org/pdf/2304.10532
  • Abstract
    Casually captured Neural Radiance Fields (NeRFs) suffer from artifacts such as floaters or flawed geometry when rendered outside the camera trajectory. Existing evaluation protocols often do not capture these effects, since they usually only assess image quality at every 8th frame of the training capture. To push forward progress in novel-view synthesis, we propose a new dataset and evaluation procedure, where two camera trajectories are recorded of the scene: one used for training, and the other for evaluation. In this more challenging in-the-wild setting, we find that existing hand-crafted regularizers do not remove floaters nor improve scene geometry. Thus, we propose a 3D diffusion-based method that leverages local 3D priors and a novel density-based score distillation sampling loss to discourage artifacts during NeRF optimization. We show that this data-driven prior removes floaters and improves scene geometry for casual captures.

Farm3D: Learning Articulated 3D Animals by Distilling 2D Diffusion

  • Authors: Tomas Jakab, Ruining Li, Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10535
  • Pdf link: https://arxiv.org/pdf/2304.10535
  • Abstract
    We present Farm3D, a method to learn category-specific 3D reconstructors for articulated objects entirely from "free" virtual supervision from a pre-trained 2D diffusion-based image generator. Recent approaches can learn, given a collection of single-view images of an object category, a monocular network to predict the 3D shape, albedo, illumination and viewpoint of any object occurrence. We propose a framework using an image generator like Stable Diffusion to generate virtual training data for learning such a reconstruction network from scratch. Furthermore, we include the diffusion model as a score to further improve learning. The idea is to randomise some aspects of the reconstruction, such as viewpoint and illumination, generating synthetic views of the reconstructed 3D object, and have the 2D network assess the quality of the resulting image, providing feedback to the reconstructor. Different from work based on distillation which produces a single 3D asset for each textual prompt in hours, our approach produces a monocular reconstruction network that can output a controllable 3D asset from a given image, real or generated, in only seconds. Our network can be used for analysis, including monocular reconstruction, or for synthesis, generating articulated assets for real-time applications such as video games.

Keyword: dynamic

GeoGraphViz: Geographically Constrained 3D Force-Directed Graph for Knowledge Graph Visualization

  • Authors: Sizhe Wang, Wenwen Li, Zhining Gu
  • Subjects: Human-Computer Interaction (cs.HC)
  • Arxiv link: https://arxiv.org/abs/2304.09864
  • Pdf link: https://arxiv.org/pdf/2304.09864
  • Abstract
    Knowledge graphs are a key technique for linking and integrating cross-domain data, concepts, tools, and knowledge to enable data-driven analytics. As much of the world's data has become massive in size, visualizing graph entities and their interrelationships intuitively and interactively has become a crucial task for ingesting and better utilizing graph content to support semantic reasoning, discovering hidden knowledge, and better scientific understanding of geophysical and social phenomena. Despite the fact that many such phenomena (e.g., disasters) have clear spatial footprints and geographical properties, their location information is considered only as a textual label in existing graph visualization tools, limiting their capability to reveal the geospatial distribution patterns of the graph nodes. In addition, most graph visualization techniques rely on 2D graph visualization, which constrains the dimensions of information that can be presented and lacks support for graph structure examination from multiple angles. To tackle the above challenges, we developed a novel 3D map-based graph visualization algorithm to enable interactive exploration of graph content and patterns in a spatially explicit manner. The algorithm extends a 3D force-directed graph by integrating a web map, an additional geolocational force, and a force-balancing variable that allows for the dynamic adjustment of the 3D graph structure and layout. This mechanism helps create a balanced graph view between the semantic forces among the graph nodes and the attractive force from a geolocation to a graph node. Our solution offers a new perspective on visualizing and understanding spatial entities and events in a knowledge graph.
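
A minimal sketch of the balancing mechanism: each node feels ordinary force-directed layout forces (spring attraction along edges, pairwise repulsion) plus a geolocational pull toward its map anchor, weighted by a balance parameter `lam`. All constants are illustrative; the GeoGraphViz algorithm adds this on top of a web map within a full 3D force-directed layout.

```python
import numpy as np

def layout_step(pos, edges, geo, lam=0.5, k_spring=0.05, k_repel=0.5, dt=0.1):
    force = np.zeros_like(pos)
    for i, j in edges:                            # spring attraction along edges
        d = pos[j] - pos[i]
        force[i] += k_spring * d
        force[j] -= k_spring * d
    for i in range(len(pos)):                     # pairwise repulsion
        diff = pos[i] - pos
        dist2 = (diff ** 2).sum(axis=1) + 1e-6
        force[i] += k_repel * (diff / dist2[:, None]).sum(axis=0)
    force += lam * (geo - pos)                    # pull toward geolocation anchors
    return pos + dt * force

rng = np.random.default_rng(0)
pos = rng.normal(size=(4, 3))                     # 3D layout positions
geo = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
edges = [(0, 1), (1, 2), (2, 3)]
for _ in range(200):
    pos = layout_step(pos, edges, geo)
print(np.round(pos, 2))                           # settles near anchors, spread apart
```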

Robust trajectory tracking for underactuated mechanical systems without velocity measurements

  • Authors: N. Javanmardi, P. Borja, M. J. Yazdanpanah, J. M. A. Scherpen
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.09910
  • Pdf link: https://arxiv.org/pdf/2304.09910
  • Abstract
    In this paper, the notion of contraction is used to solve the trajectory-tracking problem for a class of mechanical systems. Additionally, we propose a dynamic extension to remove velocity measurements from the controller while rejecting matched disturbances. In particular, we propose three control designs stemming from the Interconnection and Damping Assignment Passivity-Based Control approach. The first controller is a tracker that does not require velocity measurements. The second control design solves the trajectory-tracking problem while guaranteeing robustness with respect to matched disturbances. The third approach combines the two aforementioned controllers. It is shown that all proposed design methods guarantee exponential convergence of the mechanical system to the desired (feasible) trajectory due to the contraction property of the closed-loop system. The applicability of this method is illustrated via the design of a controller for an underactuated mechanical system.

An Intent-based Framework for Vehicular Edge Computing

  • Authors: TianZhang He, Adel N. Toosi, Negin Akbari, Muhammed Tawfiqul Islam, Muhammad Aamir Cheema
  • Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC)
  • Arxiv link: https://arxiv.org/abs/2304.09916
  • Pdf link: https://arxiv.org/pdf/2304.09916
  • Abstract
    The rapid development of emerging vehicular edge computing (VEC) brings new opportunities and challenges for dynamic resource management. The increasing number of edge data centers, roadside units (RSUs), and network devices, however, makes resource management a complex task in VEC. At the same time, the exponential growth of service applications and end-users makes it hard to maintain the corresponding quality of service (QoS). Intent-Based Networking (IBN), based on Software-Defined Networking, was introduced to provide the ability to automatically handle and manage the networking requirements of different applications. Motivated by the IBN concept, in this paper, we propose a novel approach to jointly orchestrate networking and computing resources based on user requirements. The proposed solution constantly monitors user requirements and dynamically re-configures the system to satisfy the desired application states. We compared our proposed solution with state-of-the-art network embedding algorithms using real-world taxi GPS traces. Results show that our proposed method is significantly faster (up to 95%) and can improve resource utilization (up to 76%) and the acceptance ratio of computing and networking requests with various priorities (up to 71%). We also present a small-scale prototype of the proposed intent management framework to validate our solution.

Improving Urban Flood Prediction using LSTM-DeepLabv3+ and Bayesian Optimization with Spatiotemporal feature fusion

  • Authors: Zuxiang Situ, Qi Wang, Shuai Teng, Wanen Feng, Gongfa Chen, Qianqian Zhou, Guangtao Fu
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.09994
  • Pdf link: https://arxiv.org/pdf/2304.09994
  • Abstract
    Deep learning models have become increasingly popular for flood prediction due to their superior accuracy and efficiency compared to traditional methods. However, current machine learning methods often rely on separate spatial or temporal feature analysis and have limitations on the types, number, and dimensions of input data. This study presented a CNN-RNN hybrid feature fusion modelling approach for urban flood prediction, which integrated the strengths of CNNs in processing spatial features and RNNs in analyzing different dimensions of time sequences. This approach allowed for both static and dynamic flood predictions. Bayesian optimization was applied to identify the seven most influential flood-driven factors and determine the best combination strategy. By combining four CNNs (FCN, UNet, SegNet, DeepLabv3+) and three RNNs (LSTM, BiLSTM, GRU), the optimal hybrid model was identified as LSTM-DeepLabv3+. This model achieved the highest prediction accuracy (MAE, RMSE, NSE, and KGE were 0.007, 0.025, 0.973 and 0.755, respectively) under various rainfall input conditions. Additionally, the processing speed was significantly improved, with an inference time of 1.158s (approximately 1/125 of the traditional computation time) compared to the physically-based models.
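
A minimal PyTorch sketch of this style of CNN-RNN feature fusion (an assumed stand-in architecture for illustration, not the exact LSTM-DeepLabv3+ model) could look like the following, where a convolutional encoder handles static spatial inputs and an LSTM encodes the rainfall time series:

```python
# Hedged sketch: fuse CNN spatial features with LSTM temporal features for
# per-pixel flood prediction. Dimensions and layers are illustrative.
import torch
import torch.nn as nn

class SpatioTemporalFusion(nn.Module):
    def __init__(self, rain_dim=8, hidden=16):
        super().__init__()
        self.cnn = nn.Sequential(               # spatial encoder (stand-in for DeepLabv3+)
            nn.Conv2d(3, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.lstm = nn.LSTM(rain_dim, hidden, batch_first=True)
        self.head = nn.Conv2d(2 * hidden, 1, 1)  # per-pixel flood depth

    def forward(self, terrain, rainfall):
        # terrain: (B, 3, H, W) static maps; rainfall: (B, T, rain_dim) series
        spatial = self.cnn(terrain)
        _, (h, _) = self.lstm(rainfall)          # last hidden state: (1, B, hidden)
        temporal = h[-1][:, :, None, None].expand(-1, -1, *spatial.shape[2:])
        return self.head(torch.cat([spatial, temporal], dim=1))

model = SpatioTemporalFusion()
out = model(torch.randn(2, 3, 64, 64), torch.randn(2, 12, 8))  # (2, 1, 64, 64)
```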

HTNet: Dynamic WLAN Performance Prediction using Heterogenous Temporal GNN

  • Authors: Hongkuan Zhou, Rajgopal Kannan, Ananthram Swami, Viktor Prasanna
  • Subjects: Networking and Internet Architecture (cs.NI)
  • Arxiv link: https://arxiv.org/abs/2304.10013
  • Pdf link: https://arxiv.org/pdf/2304.10013
  • Abstract
    Predicting the throughput of WLAN deployments is a classic problem that occurs in the design of robust and high-performance WLAN systems. However, due to increasingly complex communication protocols and growing interference between devices in ever-denser WLAN deployments, traditional methods either incur substantial runtime or suffer enormous prediction error, and hence cannot be applied in downstream tasks. Recently, Graph Neural Networks have been proven to be powerful graph analytic models and have been broadly applied to various networking problems such as link scheduling and power allocation. In this work, we propose HTNet, a specialized Heterogeneous Temporal Graph Neural Network that extracts features from dynamic WLAN deployments. Analyzing the unique graph structure of WLAN deployment graphs, we show that HTNet achieves the maximum expressive power on each snapshot. Based on a powerful message-passing scheme, HTNet requires fewer layers than other GNN-based methods, which entails less supporting data and runtime. To evaluate the performance of HTNet, we prepare six different setups with more than five thousand dense dynamic WLAN deployments that cover a wide range of real-world scenarios. HTNet achieves the lowest prediction error on all six setups, with an average improvement of 25.3% over state-of-the-art methods.

Topological Guided Actor-Critic Modular Learning of Continuous Systems with Temporal Objectives

  • Authors: Lening Li, Zhentian Qian
  • Subjects: Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
  • Arxiv link: https://arxiv.org/abs/2304.10041
  • Pdf link: https://arxiv.org/pdf/2304.10041
  • Abstract
    This work investigates the formal policy synthesis of continuous-state stochastic dynamic systems given high-level specifications in linear temporal logic. To learn an optimal policy that maximizes the satisfaction probability, we take a product between the dynamic system and the translated automaton to construct a product system, on which we solve an optimal planning problem. Since this product system has a hybrid product state space that results in reward sparsity, we introduce a generalized optimal backup order, in reverse of the topological order, to guide the value backups and accelerate the learning process. We provide the optimality proof for using the generalized optimal backup order in this optimal planning problem. Further, this paper presents an actor-critic reinforcement learning algorithm for the case when a topological order applies. This algorithm leverages advanced mathematical techniques and enjoys the property of hyperparameter self-tuning. We provide proof of the optimality and convergence of our proposed reinforcement learning algorithm. We use neural networks to approximate the value function and policy function over the hybrid product state space. Furthermore, we observe that encoding automaton states as integers imposes an ordinal relationship on the value (policy) functions approximated by neural networks. To break this ordinal relationship, we use an individual neural network for each automaton state's value (policy) function, termed modular learning. We conduct two experiments. First, to show the efficacy of our reinforcement learning algorithm, we compare it with baselines on a classic control task, CartPole. Second, we demonstrate the empirical performance of our formal policy synthesis framework on motion planning of a Dubins car with a temporal specification.
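
To illustrate the reverse-topological backup idea on a toy automaton (the states, values, and discount below are hypothetical, not the paper's algorithm verbatim), one can order states so that successors are valued before their predecessors:

```python
# Sketch: value backups over automaton states in reverse topological order,
# so downstream values exist before upstream states are updated.
from graphlib import TopologicalSorter  # Python 3.9+

succ = {"q0": ["q1"], "q1": ["q2"], "q2": []}   # toy transition successors
accepting = {"q2"}
gamma = 0.9

# Feeding successor sets to TopologicalSorter as "dependencies" yields the
# reverse topological order: q2 first, then q1, then q0.
backup_order = list(TopologicalSorter(succ).static_order())

V = {q: 0.0 for q in succ}
for q in backup_order:
    if q in accepting:
        V[q] = 1.0                               # satisfaction value at goal
    elif succ[q]:
        V[q] = gamma * max(V[p] for p in succ[q])
print(backup_order, V)                           # ['q2', 'q1', 'q0'] ...
```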

Dynablox: Real-time Detection of Diverse Dynamic Objects in Complex Environments

  • Authors: Lukas Schmid, Olov Andersson, Aurelio Sulser, Patrick Pfreundschuh, Roland Siegwart
  • Subjects: Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/2304.10049
  • Pdf link: https://arxiv.org/pdf/2304.10049
  • Abstract
    Real-time detection of moving objects is an essential capability for robots acting autonomously in dynamic environments. We thus propose Dynablox, a novel online mapping-based approach for robust moving object detection in complex environments. The central idea of our approach is to incrementally estimate high confidence free-space areas by modeling and accounting for sensing, state estimation, and mapping limitations during online robot operation. The spatio-temporally conservative free space estimate enables robust detection of moving objects without making any assumptions on the appearance of objects or environments. This allows deployment in complex scenes such as multi-storied buildings or staircases, and for diverse moving objects such as people carrying various items, doors swinging or even balls rolling around. We thoroughly evaluate our approach on real-world data sets, achieving 86% IoU at 17 FPS in typical robotic settings. The method outperforms a recent appearance-based classifier and approaches the performance of offline methods. We demonstrate its generality on a novel data set with rare moving objects in complex environments. We make our efficient implementation and the novel data set available as open-source.
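
A toy sketch of the central free-space idea follows, with assumed data structures (a hash set of voxels, not the authors' mapping pipeline): any point of a new scan that falls into a voxel previously established as high-confidence free space is flagged as moving.

```python
# Sketch: flag scan points that appear inside known free space as dynamic.
import numpy as np

VOXEL = 0.2  # metres; illustrative resolution

def voxel_key(p):
    return tuple(np.floor(p / VOXEL).astype(int))

free_space = set()          # voxels confidently observed empty earlier

def update_free_space(observed_empty_points):
    for p in observed_empty_points:
        free_space.add(voxel_key(p))

def detect_dynamic(scan_points):
    return [p for p in scan_points if voxel_key(p) in free_space]

update_free_space(np.array([[1.0, 0.0, 0.0], [1.2, 0.0, 0.0]]))
moving = detect_dynamic(np.array([[1.05, 0.02, 0.0], [5.0, 5.0, 0.0]]))
print(moving)   # only the first point lies in previously-free space
```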

Recurrent Transformer for Dynamic Graph Representation Learning with Edge Temporal States

  • Authors: Shengxiang Hu, Guobing Zou, Shiyi Lin, Liangrui Wu, Chenyang Zhou, Bofeng Zhang, Yixin Chen
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10079
  • Pdf link: https://arxiv.org/pdf/2304.10079
  • Abstract
    Dynamic graph representation learning is an increasingly popular yet challenging research task, owing to the widespread demand for graph data analysis in real-world applications. Despite the encouraging performance of many recent works that build upon recurrent neural networks (RNNs) and graph neural networks (GNNs), they fail to explicitly model the impact of edge temporal states on node features over time slices. Moreover, they struggle to extract global structural features because of the inherent over-smoothing of GNNs, which further restricts performance. In this paper, we propose a recurrent difference graph transformer (RDGT) framework, which first assigns the edges in each snapshot distinct types and weights to make their temporal states explicit, and then employs a structure-reinforced graph transformer to capture temporal node representations through a recurrent learning paradigm. Experimental results on four real-world datasets demonstrate the superiority of RDGT for discrete dynamic graph representation learning, as it consistently outperforms competing methods in dynamic link prediction tasks.
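
The edge-typing step can be pictured with a small sketch (the state names and weights below are placeholders, not the paper's exact scheme): comparing consecutive snapshots yields an explicit temporal state for every edge.

```python
# Sketch: assign each edge a temporal state by diffing two snapshots.
def edge_temporal_states(prev_edges, curr_edges):
    prev, curr = set(prev_edges), set(curr_edges)
    states = {}
    for e in curr | prev:
        if e in curr and e in prev:
            states[e] = ("persistent", 1.0)   # weight values are illustrative
        elif e in curr:
            states[e] = ("added", 0.5)
        else:
            states[e] = ("removed", -0.5)
    return states

snap_t0 = [(0, 1), (1, 2)]
snap_t1 = [(1, 2), (2, 3)]
print(edge_temporal_states(snap_t0, snap_t1))
```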

Securing Semantic Communications with Physical-layer Semantic Encryption and Obfuscation

  • Authors: Qi Qin, Yankai Rong, Guoshun Nan, Shaokang Wu, Xuefei Zhang, Qimei Cui, Xiaofeng Tao
  • Subjects: Cryptography and Security (cs.CR)
  • Arxiv link: https://arxiv.org/abs/2304.10147
  • Pdf link: https://arxiv.org/pdf/2304.10147
  • Abstract
    Deep learning based semantic communication (DLSC) systems have shown great potential to make wireless networks significantly more efficient by transmitting only the semantics of the data. However, the open nature of the wireless channel and the fragility of neural models make DLSC systems extremely vulnerable to various attacks. Traditional wireless physical layer key (PLK) generation, which relies on the reciprocity and randomness characteristics of the channel between two legitimate users, holds the promise of securing DLSC. The main challenge lies in generating secret keys in static environments, where the key rate is ultra-low or zero. Different from prior efforts that use relays or reconfigurable intelligent surfaces (RIS) to manipulate wireless channels, this paper proposes a novel physical layer semantic encryption scheme that exploits the randomness of bilingual evaluation understudy (BLEU) scores from the field of machine translation, and additionally presents a novel semantic obfuscation mechanism to provide further physical-layer protection. Specifically, 1) we calculate the BLEU scores and corresponding weights of the DLSC system, and generate semantic keys (SKey) by feeding the weighted sum of the scores into a hash function; 2) equipped with the SKey, our proposed subcarrier obfuscation further secures semantic communications with a dynamic dummy data insertion mechanism. Experiments show the effectiveness of our method, especially in static wireless environments.
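
The SKey construction can be sketched in a few lines (the scores, weights, and quantisation step below are placeholders; the paper derives them from the DLSC system itself):

```python
# Sketch: hash a weighted sum of BLEU-like scores into a fixed-length key.
import hashlib

def semantic_key(bleu_scores, weights, precision=6):
    weighted = sum(w * s for w, s in zip(weights, bleu_scores))
    digest_input = f"{weighted:.{precision}f}".encode()  # quantise, then hash
    return hashlib.sha256(digest_input).hexdigest()

scores = [0.71, 0.64, 0.58, 0.49]   # e.g. 1- to 4-gram BLEU components
weights = [0.4, 0.3, 0.2, 0.1]
print(semantic_key(scores, weights))
```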

Automated Dynamic Bayesian Networks for Predicting Acute Kidney Injury Before Onset

  • Authors: David Gordon, Panayiotis Petousis, Anders O. Garlid, Keith Norris, Katherine Tuttle, Susanne B. Nicholas, Alex A.T. Bui (on behalf of CURE-CKD)
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10175
  • Pdf link: https://arxiv.org/pdf/2304.10175
  • Abstract
    Several algorithms for learning the structure of dynamic Bayesian networks (DBNs) require an a priori ordering of variables, which influences the determined graph topology. However, it is often unclear how to determine this order if feature importance is unknown, especially as an exhaustive search is usually impractical. In this paper, we introduce Ranking Approaches for Unknown Structures (RAUS), an automated framework to systematically inform variable ordering and learn networks end-to-end. RAUS leverages existing statistical methods (Cramér's V, the chi-squared test, and information gain) to compare variable orderings, the resulting network topologies, and DBN performance. RAUS enables end-users with limited DBN expertise to implement models via a command line interface. We evaluate RAUS on the task of predicting impending acute kidney injury (AKI) from inpatient clinical laboratory data. Longitudinal observations from 67,460 patients were collected from our electronic health record (EHR), and Kidney Disease Improving Global Outcomes (KDIGO) criteria were then applied to define AKI events. RAUS learns multiple DBNs simultaneously to predict a future AKI event at different time points (i.e., 24, 48, and 72 hours in advance of AKI). We also compared the results of the learned AKI prediction models and variable orderings to baseline techniques (logistic regression, random forests, and extreme gradient boosting). The DBNs generated by RAUS achieved 73-83% area under the receiver operating characteristic curve (AUCROC) within 24 hours before AKI, and 71-79% AUCROC within 48 hours before AKI of any stage in a 7-day observation window. Insights from this automated framework can help efficiently implement and interpret DBNs for clinical decision support. The source code for RAUS is available on GitHub at https://github.com/dgrdn08/RAUS .
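
One of the three ranking criteria, Cramér's V, can be sketched as below (column names and data are hypothetical; this is not the RAUS implementation):

```python
# Sketch: rank candidate variables by Cramér's V against the outcome.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(x, y):
    table = pd.crosstab(x, y)
    chi2 = chi2_contingency(table)[0]
    n = table.values.sum()
    r, k = table.shape
    return np.sqrt(chi2 / (n * (min(r, k) - 1)))

df = pd.DataFrame({                       # toy binary clinical features
    "creatinine_high": np.random.randint(0, 2, 500),
    "on_dialysis": np.random.randint(0, 2, 500),
    "aki_24h": np.random.randint(0, 2, 500),
})
ranking = sorted(
    (c for c in df.columns if c != "aki_24h"),
    key=lambda c: cramers_v(df[c], df["aki_24h"]),
    reverse=True)
print(ranking)                            # strongest association first
```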

UAV-based Receding Horizon Control for 3D Inspection Planning

  • Authors: Savvas Papaioannou, Panayiotis Kolios, Theocharis Theocharides, Christos G. Panayiotou, Marios M. Polycarpou
  • Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.10201
  • Pdf link: https://arxiv.org/pdf/2304.10201
  • Abstract
    Unmanned aerial vehicles (UAVs) are now used for a wide range of tasks, including infrastructure inspection, automated monitoring, and coverage. This paper investigates the problem of 3D inspection planning with an autonomous UAV agent that is subject to dynamical and sensing constraints. We propose a receding horizon 3D inspection planning control approach for generating optimal trajectories that enable an autonomous UAV agent to inspect a finite number of feature points scattered on the surface of a cuboid-like structure of interest. The inspection planning problem is formulated as a constrained open-loop optimal control problem and is solved using mixed integer programming (MIP) optimization. Quantitative and qualitative evaluation demonstrates the effectiveness of the proposed approach.

Dynamic Security Region of Natural Gas Systems in Integrated Electricity-Gas Systems

  • Authors: Han Gao, Peiyao Zhao, Zhengshuo Li
  • Subjects: Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.10215
  • Pdf link: https://arxiv.org/pdf/2304.10215
  • Abstract
    In an integrated electricity-gas system (IEGS), the tight coupling of power and natural gas systems is embodied by frequent changes in gas withdrawal from gas-fired units to provide regulation services for the power system to handle uncertainty, which may in turn endanger the secure operation of the natural gas system and ultimately affect the safety of the whole IEGS. Hence, it is necessary to accurately and efficiently evaluate the dynamic security region (DSR) of the natural gas system in the IEGS by considering the real-time dynamic characteristics of natural gas systems, which are not satisfactorily handled in state-of-the-art works. To bridge this gap, this paper first conceptually verifies the necessity of the DSR and establishes its mathematical model. Then, a dimensionality reduction method is proposed for the efficient solution and visualization of the high-dimensional DSR evaluation model. A fast evaluation (FE) algorithm is developed to address the difficulties of the nonconvex dynamic constraints in the reduced DSR model. Finally, the necessity and notable advantages of the proposed DSR model and FE are verified on small and relatively large test systems in comparison with common security region models and algorithms. To the best of our knowledge, this is the first paper that comprehensively presents models and efficient algorithms regarding the DSR of natural gas systems in an IEGS.

Filter-Aware Model-Predictive Control

  • Authors: Baris Kayalibay, Atanas Mirchev, Ahmed Agha, Patrick van der Smagt, Justin Bayer
  • Subjects: Machine Learning (cs.LG); Robotics (cs.RO); Systems and Control (eess.SY)
  • Arxiv link: https://arxiv.org/abs/2304.10246
  • Pdf link: https://arxiv.org/pdf/2304.10246
  • Abstract
    Partially-observable problems pose a trade-off between reducing costs and gathering information. They can be solved optimally by planning in belief space, but that is often prohibitively expensive. Model-predictive control (MPC) takes the alternative approach of using a state estimator to form a belief over the state and then planning in state space. This ignores potential future observations during planning and, as a result, cannot actively increase or preserve the certainty of its own state estimate. We find a middle ground between planning in belief space and ignoring belief dynamics entirely by reasoning only about the future accuracy of the state estimate. Our approach, filter-aware MPC, penalises the loss of information via what we call "trackability", the expected error of the state estimator. We show that model-based simulation allows condensing trackability into a neural network, which enables fast planning. In experiments involving visual navigation, realistic everyday environments, and a two-link robot arm, we show that filter-aware MPC vastly outperforms regular MPC.
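
The trackability penalty can be pictured with a toy random-shooting MPC sketch (the dynamics, cost, and error model below are stand-ins for illustration, not the paper's learned trackability network):

```python
# Sketch: score candidate action sequences by task cost plus a penalty on
# the expected growth of the state-estimator error ("trackability").
import numpy as np

def rollout_cost(x, actions, goal, lam=0.5):
    task, track_err = 0.0, 0.1          # start with a small estimator error
    for a in actions:
        x = x + a                       # trivial integrator dynamics
        # assume observations get less informative far from the origin,
        # so the filter error grows there (a stand-in trackability model)
        track_err += 0.05 * np.linalg.norm(x)
        task += np.linalg.norm(x - goal)
    return task + lam * track_err

rng = np.random.default_rng(0)
x0, goal = np.zeros(2), np.array([3.0, 0.0])
candidates = rng.normal(size=(64, 5, 2))        # 64 sequences, horizon 5
best = min(candidates, key=lambda acts: rollout_cost(x0, acts, goal))
print(best[0])                                   # first action to execute
```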

Learning Representative Trajectories of Dynamical Systems via Domain-Adaptive Imitation

  • Authors: Edgardo Solano-Carrillo, Jannis Stoppe
  • Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
  • Arxiv link: https://arxiv.org/abs/2304.10260
  • Pdf link: https://arxiv.org/pdf/2304.10260
  • Abstract
    Domain-adaptive trajectory imitation is a skill that some predators learn for survival, by mapping dynamic information from one domain (their speed and steering direction) to a different domain (the current position of the moving prey). An intelligent agent with this skill could be exploited for a diversity of tasks, including the recognition of abnormal motion in traffic once it has learned to imitate representative trajectories. To this end, we propose DATI, a deep reinforcement learning agent designed for domain-adaptive trajectory imitation using a cycle-consistent generative adversarial method. Our experiments on a variety of synthetic families of reference trajectories show that DATI outperforms baseline methods for imitation learning and optimal control in this setting, keeping the same per-task hyperparameters. Its generalization to a real-world scenario is shown through the discovery of abnormal motion patterns in maritime traffic, opening the door to the use of deep reinforcement learning methods for spatially-unconstrained trajectory data mining.

BackCache: Mitigating Contention-Based Cache Timing Attacks by Hiding Cache Line Evictions

  • Authors: Quancheng Wang, Ming Tang, Han Wang, Yuzhe Gu
  • Subjects: Cryptography and Security (cs.CR); Hardware Architecture (cs.AR)
  • Arxiv link: https://arxiv.org/abs/2304.10268
  • Pdf link: https://arxiv.org/pdf/2304.10268
  • Abstract
    Caches are used to reduce the speed differential between the CPU and memory and thereby improve the performance of modern processors. However, attackers can use contention-based cache timing attacks to steal sensitive information from victim processes through carefully designed cache eviction sets, and L1 data cache attacks in particular are widely exploited and pose a significant threat to privacy and confidentiality. Existing hardware-based countermeasures mainly focus on cache partitioning, randomization, and cache line flushing, which unfortunately either incur high overhead or can be circumvented by sophisticated attacks. In this paper, we propose a novel hardware-software co-design called BackCache, built on the idea of always achieving cache hits instead of cache misses to mitigate contention-based cache timing attacks on the L1 data cache. BackCache places the cache lines evicted from the L1 data cache into a fully-associative backup cache to hide the evictions. To improve the security of BackCache, we introduce a randomly used replacement policy (RURP) and a dynamic backup cache resizing mechanism. We also present a theoretical security analysis to demonstrate the effectiveness of BackCache. Our evaluation on the gem5 simulator shows that BackCache degrades performance by only 1.33%, 7.34%, and 7.59% for OS kernel, single-thread, and multi-thread benchmarks, respectively.
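
A behavioural sketch of the idea (a software model under simplifying assumptions, not RTL or the gem5 implementation) follows: lines evicted from a direct-mapped L1 move into a small fully-associative backup cache with a randomly chosen victim, so references to recently evicted lines still hit.

```python
# Sketch: backup cache hides L1 evictions; random victim mimics RURP.
import random

class BackCache:
    def __init__(self, l1_sets=4, backup_size=4):
        self.l1 = [None] * l1_sets      # toy direct-mapped L1
        self.backup = {}                # fully associative: addr -> line
        self.backup_size = backup_size

    def access(self, addr):
        idx = addr % len(self.l1)
        if self.l1[idx] == addr or addr in self.backup:
            return "hit"                # backup hit hides the earlier eviction
        evicted = self.l1[idx]
        self.l1[idx] = addr
        if evicted is not None:
            if len(self.backup) >= self.backup_size:
                self.backup.pop(random.choice(list(self.backup)))  # RURP-like
            self.backup[evicted] = True
        return "miss"

c = BackCache()
print([c.access(a) for a in [0, 4, 0, 4]])   # conflicting set: backup hits
```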

Observer-Feedback-Feedforward Controller Structures in Reinforcement Learning

  • Authors: Ruoqi Zhang, Per Mattson, Torbjörn Wigren
  • Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10276
  • Pdf link: https://arxiv.org/pdf/2304.10276
  • Abstract
    The paper proposes the use of structured neural networks for reinforcement learning based nonlinear adaptive control. The focus is on partially observable systems, with separate neural networks for the state and feedforward observer and for the state feedback and feedforward controller. The observer dynamics are modelled by recurrent neural networks, while a standard network is used for the controller. As discussed in the paper, this leads to a separation of the observer dynamics into the recurrent neural network part, and of the state feedback into the feedback and feedforward network. The structured approach reduces the computational complexity and gives the reinforcement learning based controller an understandable structure compared to using one single neural network. As shown by simulation, the proposed structure has the main additional advantage that training becomes significantly faster. Two ways to include feedforward structure are presented, one related to state feedback control and one related to classical feedforward control. The latter method introduces further structure with a separate recurrent neural network that processes only the measured disturbance. When evaluated by simulation on a nonlinear cascaded double tank process, the method with the most structure performs best, with excellent feedforward disturbance rejection gains.

Aiding reinforcement learning for set point control

  • Authors: Ruoqi Zhang, Per Mattsson, Torbjörn Wigren
  • Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10289
  • Pdf link: https://arxiv.org/pdf/2304.10289
  • Abstract
    While reinforcement learning has made great strides, state-of-the-art algorithms can still struggle with seemingly simple set-point feedback control problems. One reason for this is that the learned controller may not initially excite the system dynamics well enough, and therefore it can take a long time to obtain data that is informative enough to learn a good controller. The paper contributes by augmenting reinforcement learning with a simple guiding feedback controller, for example a proportional controller. The key advantage in set-point control is much-improved excitation, which significantly improves the convergence properties of the reinforcement learning controller. This can be very important in real-world control, where quick and accurate convergence is needed. The proposed method is evaluated with simulation and on a real-world double tank process, with promising results.
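
The guiding idea can be sketched in a few lines (the blend schedule and gain below are illustrative assumptions, not the paper's exact scheme): the proportional controller dominates early on and is faded out as the RL policy improves.

```python
# Sketch: blend a P controller with the RL policy's action so the system
# is excited sensibly from the first episodes.
def guided_action(rl_action, setpoint, measurement, step,
                  kp=0.8, guide_steps=1000):
    p_action = kp * (setpoint - measurement)        # simple P controller
    w = max(0.0, 1.0 - step / guide_steps)          # fade the guide out
    return w * p_action + (1.0 - w) * rl_action

print(guided_action(rl_action=0.1, setpoint=1.0, measurement=0.4, step=100))
```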

FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits

  • Authors: Polina Karpikova (1 and 2), Radionova Ekaterina (1), Anastasia Yaschenko (1 and 2), Andrei Spiridonov (1), Leonid Kostyushko (3), Riccardo Fabbricatore (1), Aleksei Ivakhnenko (1) ((1) Samsung AI Center, (2) Higher School of Economics, (3) Lomonosov Moscow State University)
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10306
  • Pdf link: https://arxiv.org/pdf/2304.10306
  • Abstract
    Generative DNNs are a powerful tool for image synthesis, but they are limited by their computational load. On the other hand, given a trained model and a task, e.g. face generation within a range of characteristics, the output image quality will be unevenly distributed among images with different characteristics. It follows that we can restrain the model's complexity on some instances while maintaining high quality. We propose a method for diminishing computations by adding so-called early exit branches to the original architecture, and dynamically switching the computational path depending on how difficult it will be to render the output. We apply our method to two different SOTA models performing generative tasks: generation from a semantic map, and cross-reenactment of face expressions, showing that it can output images with custom lower-quality thresholds. For a threshold of LPIPS <= 0.1, we diminish their computations by up to a half. This is especially relevant for real-time applications such as face synthesis, where quality loss needs to be contained but most inputs need fewer computations than the complex instances.
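
The early-exit mechanism can be sketched roughly as below (the quality estimator, threshold, and layer sizes are assumptions for illustration, not the paper's architecture): each stage carries an exit head, and computation stops once the predicted quality clears the threshold.

```python
# Sketch: conditional early exits in a toy convolutional generator.
import torch
import torch.nn as nn

class EarlyExitGenerator(nn.Module):
    def __init__(self, ch=16, n_stages=3):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=1) for _ in range(n_stages))
        self.exits = nn.ModuleList(                  # render heads per stage
            nn.Conv2d(ch, 3, 1) for _ in range(n_stages))
        self.quality = nn.ModuleList(                # predicts e.g. a quality score
            nn.Linear(ch, 1) for _ in range(n_stages))

    def forward(self, x, threshold=0.9):
        for stage, exit_head, q_head in zip(self.stages, self.exits, self.quality):
            x = torch.relu(stage(x))
            q = torch.sigmoid(q_head(x.mean(dim=(2, 3)))).mean()
            if q > threshold:                        # easy input: exit early
                return exit_head(x)
        return exit_head(x)                          # hardest path: full depth

net = EarlyExitGenerator()
img = net(torch.randn(1, 16, 32, 32), threshold=0.5)
```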

ORIGAMI: A flexible state channels design for public blockchain systems

  • Authors: Lydia Negka, Angeliki Katsika, Georgios Spathoulas, Vassilis Plagianakos
  • Subjects: Cryptography and Security (cs.CR)
  • Arxiv link: https://arxiv.org/abs/2304.10313
  • Pdf link: https://arxiv.org/pdf/2304.10313
  • Abstract
    Public blockchain systems offer security guarantees that cannot be matched by any centralised system. This offering has attracted a lot of interest and has exposed a significant limitation of most blockchain designs with regard to scalability. One of the proposed scaling solutions is state channels, which enable serving given applications with a minimal number of on-chain transactions. Existing state-channel designs impose multiple compatibility requirements on the applications to be deployed. Origami is a novel state-channel design that removes most of the requirements of existing approaches, while also offering a number of new features. Origami enables dynamic groups of users to interact in an unordered way, completely off-chain, after an initial on-boarding on-chain transaction. The proposed design is analysed in detail and compared to existing schemes, while a formal security analysis validates the security properties it offers.

Polylog-Competitive Algorithms for Dynamic Balanced Graph Partitioning for Ring Demands

  • Authors: Harald Räcke, Stefan Schmid, Ruslan Zabrodin
  • Subjects: Data Structures and Algorithms (cs.DS)
  • Arxiv link: https://arxiv.org/abs/2304.10350
  • Pdf link: https://arxiv.org/pdf/2304.10350
  • Abstract
    The performance of many large-scale and data-intensive distributed systems critically depends on the capacity of the interconnecting network. This paper is motivated by the vision of self-adjusting infrastructures whose resources can be adjusted according to the workload they currently serve, in a demand-aware manner. Such dynamic adjustments can be exploited to improve network utilization and hence performance, by dynamically moving frequently interacting communication partners closer, e.g., collocating them in the same server or datacenter rack. In particular, we revisit the online balanced graph partitioning problem which captures the fundamental tradeoff between the benefits and costs of dynamically collocating communication partners. The demand is modelled as a sequence $\sigma$ (revealed in an online manner) of communication requests between $n$ processes, each of which is running on one of the $\ell$ servers. Each server has capacity $k=n/\ell$, hence, the processes have to be scheduled in a balanced manner across the servers. A request incurs cost $1$, if the requested processes are located on different servers, otherwise the cost is 0. A process can be migrated to a different server at cost $1$. This paper presents the first online algorithm for online balanced graph partitioning achieving a polylogarithmic competitive ratio for the fundamental case of ring communication patterns. Specifically, our main contribution is a $O(\log^3 n)$-competitive randomized online algorithm for this problem. We further present a randomized online algorithm which is $O(\log^2 n)$-competitive when compared to a static optimal solution. Our two results rely on different algorithms and techniques and hence are of independent interest.

PDL on Steroids: on Expressive Extensions of PDL with Intersection and Converse

  • Authors: Diego Figueira, Santiago Figueira, Edwin Pin
  • Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI); Databases (cs.DB)
  • Arxiv link: https://arxiv.org/abs/2304.10381
  • Pdf link: https://arxiv.org/pdf/2304.10381
  • Abstract
    We introduce CPDL+, a family of expressive logics rooted in Propositional Dynamic Logic (PDL). In terms of expressive power, CPDL+ strictly contains PDL extended with intersection and converse (a.k.a. ICPDL) as well as Conjunctive Queries (CQ), Conjunctive Regular Path Queries (CRPQ), and some known extensions thereof (Regular Queries and CQPDL). We investigate the expressive power, characterization of bisimulation, satisfiability, and model checking for CPDL+. We argue that natural subclasses of CPDL+ can be defined in terms of the tree-width of the underlying graphs of the formulas. We show that the class of CPDL+ formulas of tree-width 2 is equivalent to ICPDL, and that it also coincides with CPDL+ formulas of tree-width 1. However, beyond tree-width 2, incrementing the tree-width strictly increases the expressive power. We characterize the expressive power for every class of fixed tree-width formulas in terms of a bisimulation game with pebbles. Based on this characterization, we show that CPDL+ has a tree-like model property. We prove that the satisfiability problem is decidable in 2ExpTime on fixed tree-width formulas, coinciding with the complexity of ICPDL. We also exhibit classes for which satisfiability is reduced to ExpTime. Finally, we establish that the model checking problem for fixed tree-width formulas is in PTime, contrary to the full class CPDL+.

Multi-label Node Classification On Graph-Structured Data

  • Authors: Tianqi Zhao, Ngan Thi Dong, Alan Hanjalic, Megha Khosla
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2304.10398
  • Pdf link: https://arxiv.org/pdf/2304.10398
  • Abstract
    Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node classification tasks on graphs. While these improvements have been largely demonstrated in a multi-class classification scenario, a more general and realistic scenario in which each node could have multiple labels has so far received little attention. The first challenge in conducting focused studies on multi-label node classification is the limited number of publicly available multi-label graph datasets. Therefore, as our first contribution, we collect and release three real-world biological datasets and develop a multi-label graph generator to generate datasets with tunable properties. While high label similarity (high homophily) is usually attributed to the success of GNNs, we argue that a multi-label scenario does not follow the usual semantics of homophily and heterophily as defined so far for the multi-class scenario. As our second contribution, besides defining homophily for the multi-label scenario, we develop a new approach that dynamically fuses feature and label correlation information to learn label-informed representations. Finally, we perform a large-scale comparative study with $10$ methods and $9$ datasets that also showcases the effectiveness of our approach. We release our benchmark at \url{https://anonymous.4open.science/r/LFLF-5D8C/}.
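
One natural way to extend homophily to label sets (an illustrative choice on our part, not necessarily the paper's definition) is the mean Jaccard overlap of label sets across edges:

```python
# Sketch: multi-label homophily as mean Jaccard similarity over edges.
def multilabel_homophily(edges, labels):
    def jaccard(a, b):
        return len(a & b) / len(a | b) if (a | b) else 0.0
    return sum(jaccard(labels[u], labels[v]) for u, v in edges) / len(edges)

labels = {0: {"A", "B"}, 1: {"B"}, 2: {"C"}, 3: {"B", "C"}}
edges = [(0, 1), (1, 3), (2, 3)]
print(multilabel_homophily(edges, labels))  # 0.5 for this toy graph
```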

Distributed Neural Representation for Reactive in situ Visualization

  • Authors: Qi Wu, Joseph A. Insley, Victor A. Mateevitsi, Silvio Rizzi, Michael E. Papka, Kwan-Liu Ma
  • Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2304.10516
  • Pdf link: https://arxiv.org/pdf/2304.10516
  • Abstract
    In situ visualization and steering of computational modeling can be effectively achieved using reactive programming, which leverages temporal abstraction and data caching mechanisms to create dynamic workflows. However, implementing a temporal cache for large-scale simulations can be challenging. While implicit neural networks have proven effective in compressing large volume data, their application to distributed data has yet to be fully explored. In this work, we develop an implicit neural representation for distributed volume data and incorporate it into the DIVA reactive programming system. This implementation enables us to build an in situ temporal caching system with a capacity 100 times larger than previously achieved. We integrate our implementation into the Ascent infrastructure and evaluate its performance using real-world simulations.

A class of mesh-free algorithms for some problems arising in finance and machine learning

  • Authors: Philippe G. LeFloch, Jean-Marc Mercier
  • Subjects: Numerical Analysis (math.NA)
  • Arxiv link: https://arxiv.org/abs/2304.10521
  • Pdf link: https://arxiv.org/pdf/2304.10521
  • Abstract
    We introduce a numerical methodology, referred to as the transport-based mesh-free method, which allows us to deal with continuous, discrete, or statistical models in the same unified framework, and leads us to a broad class of numerical algorithms recently implemented in a Python library (namely, CodPy). Specifically, we propose a mesh-free discretization technique based on the theory of reproducing kernels and the theory of transport mappings, in a way that is reminiscent of Lagrangian methods in computational fluid dynamics. We introduce kernel-based discretizations of a variety of differential and discrete operators (gradient, divergence, Laplacian, Leray projection, extrapolation, interpolation, polar factorization). The proposed algorithms are nonlinear in nature and enjoy quantitative error estimates based on the notion of discrepancy error, which allows one to evaluate the relevance and accuracy of, both, the given data and the numerical solutions. Our strategy is relevant when a large number of degrees of freedom are present as is the case in mathematical finance and machine learning. We consider the Fokker-Planck-Kolmogorov system (relevant for problems arising in finance and material dynamics) and a class of neural networks based on support vector machines.
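
In the spirit of the kernel-based operators described above, here is a minimal mesh-free interpolation sketch with a Gaussian reproducing kernel (the kernel choice and regularisation are our own illustrative assumptions, not the CodPy implementation):

```python
# Sketch: mesh-free interpolation of scattered data via a Gaussian kernel.
import numpy as np

def gaussian_kernel(X, Y, scale=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * scale ** 2))

def kernel_interpolate(X_train, f_train, X_test, scale=1.0, eps=1e-8):
    K = gaussian_kernel(X_train, X_train, scale)
    coeffs = np.linalg.solve(K + eps * np.eye(len(X_train)), f_train)
    return gaussian_kernel(X_test, X_train, scale) @ coeffs

X = np.random.rand(50, 2)                 # scattered nodes, no mesh needed
f = np.sin(X[:, 0]) + np.cos(X[:, 1])
X_new = np.random.rand(5, 2)
print(kernel_interpolate(X, f, X_new))
```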

Collaborative Diffusion for Multi-Modal Face Generation and Editing

  • Authors: Ziqi Huang, Kelvin C.K. Chan, Yuming Jiang, Ziwei Liu
  • Subjects: Computer Vision and Pattern Recognition (cs.CV)
  • Arxiv link: https://arxiv.org/abs/2304.10530
  • Pdf link: https://arxiv.org/pdf/2304.10530
  • Abstract
    Diffusion models have recently emerged as a powerful generative tool. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further unleash users' creativity, it is desirable for the model to be controllable by multiple modalities simultaneously, e.g., generating and editing faces by describing the age (text-driven) while drawing the face shape (mask-driven). In this work, we present Collaborative Diffusion, where pre-trained uni-modal diffusion models collaborate to achieve multi-modal face generation and editing without re-training. Our key insight is that diffusion models driven by different modalities are inherently complementary regarding the latent denoising steps, upon which bilateral connections can be established. Specifically, we propose the dynamic diffuser, a meta-network that adaptively hallucinates multi-modal denoising steps by predicting the spatial-temporal influence functions for each pre-trained uni-modal model. Collaborative Diffusion not only combines the generation capabilities of uni-modal diffusion models, but also integrates multiple uni-modal manipulations to perform multi-modal editing. Extensive qualitative and quantitative experiments demonstrate the superiority of our framework in both image quality and condition consistency.
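
A heavily simplified schematic of the dynamic-diffuser idea (shapes, layers, and the two-model setup are assumptions for illustration, not the authors' meta-network) could look like:

```python
# Sketch: a small meta-network predicts per-pixel weights that blend the
# denoising predictions of two pre-trained uni-modal diffusion models.
import torch
import torch.nn as nn

class DynamicDiffuser(nn.Module):
    def __init__(self, ch=4):
        super().__init__()
        self.weigher = nn.Conv2d(2 * ch, 2, 1)   # one influence map per model

    def forward(self, eps_text, eps_mask):
        w = torch.softmax(self.weigher(torch.cat([eps_text, eps_mask], 1)), dim=1)
        return w[:, :1] * eps_text + w[:, 1:] * eps_mask

fuser = DynamicDiffuser()
eps_a = torch.randn(1, 4, 8, 8)      # text-driven model's noise prediction
eps_b = torch.randn(1, 4, 8, 8)      # mask-driven model's noise prediction
fused = fuser(eps_a, eps_b)          # combined denoising step
```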