fix(route): Cool Papers #15223

Merged
merged 2 commits into from
Apr 14, 2024
Conversation

Contributor

@nczitzk commented Apr 13, 2024

Involved Issue / 该 PR 相关 Issue

Close #15178

Example for the Proposed Route(s) / 路由地址示例

/papers/arxiv/cs.AI
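The proposed route can be smoke-tested by parsing the RSS 2.0 payload it emits. A minimal sketch using only the standard library (the sample payload is a trimmed excerpt of the route's output; the `items` helper is an illustration, not part of RSSHub):

```python
# Sketch: parse an RSS 2.0 feed like the one /papers/arxiv/cs.AI emits and
# extract (title, PDF enclosure URL) pairs from each <item>.
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Artificial Intelligence</title>
    <link>https://papers.cool/arxiv/cs.AI</link>
    <item>
      <title>Uncertainty-guided annotation enhances segmentation with the human-in-the-loop</title>
      <link>https://papers.cool/arxiv/2404.07208</link>
      <enclosure url="https://arxiv.org/pdf/2404.07208.pdf" type="application/pdf"></enclosure>
    </item>
  </channel>
</rss>"""

def items(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, enclosure URL) pairs for every <item> in the feed."""
    # Encode first: ET.fromstring rejects str input that carries an
    # XML encoding declaration.
    root = ET.fromstring(feed_xml.encode("utf-8"))
    out = []
    for item in root.iter("item"):
        enc = item.find("enclosure")
        out.append((item.findtext("title", default=""),
                    enc.get("url") if enc is not None else ""))
    return out

print(items(SAMPLE))
```

In practice the same helper can be pointed at a locally running instance (e.g. fetching `http://localhost:1200/papers/arxiv/cs.AI` and passing the body through) to confirm the route returns a non-empty item list.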

New RSS Route Checklist / 新 RSS 路由检查表

  • New Route / 新的路由
  • Anti-bot or rate limit / 反爬/频率限制
    • If yes, does the code take corresponding measures? / 如果有, 是否有对应的措施?
  • Date and time / 日期和时间
    • Parsed / 可以解析
    • Correct time zone / 时区正确
  • New package added / 添加了新的包
  • Puppeteer

Note / 说明

@github-actions github-actions bot added Route Auto: Route Test Complete Auto route test has finished on given PR labels Apr 13, 2024

Successfully generated as follows:

http://localhost:1200/papers/arxiv/cs.AI - Success ✔️
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Artificial Intelligence</title>
    <link>https://papers.cool/arxiv/cs.AI</link>
    <atom:link href="http://localhost:1200/papers/arxiv/cs.AI" rel="self" type="application/rss+xml"></atom:link>
    <description>Artificial Intelligence - Made with love by RSSHub(https://github.com/DIYgod/RSSHub)</description>
    <generator>RSSHub</generator>
    <webMaster>i@diygod.me (DIYgod)</webMaster>
    <language>en</language>
    <lastBuildDate>Sat, 13 Apr 2024 16:02:55 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>Uncertainty-guided annotation enhances segmentation with the human-in-the-loop</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07208.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07208&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07208&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Deep learning algorithms, often critiqued for their &#39;black box&#39; nature, traditionally fall short in providing the necessary transparency for trusted clinical use. This challenge is particularly evident when such models are deployed in local hospitals, encountering out-of-domain distributions due to varying imaging techniques and patient-specific pathologies. Yet, this limitation offers a unique avenue for continual learning. The Uncertainty-Guided Annotation (UGA) framework introduces a human-in-the-loop approach, enabling AI to convey its uncertainties to clinicians, effectively acting as an automated quality control mechanism. UGA eases this interaction by quantifying uncertainty at the pixel level, thereby revealing the model&#39;s limitations and opening the door for clinician-guided corrections. We evaluated UGA on the Camelyon dataset for lymph node metastasis segmentation which revealed that UGA improved the Dice coefficient (DC), from 0.66 to 0.76 by adding 5 patches, and further to 0.84 with 10 patches. To foster broader application and community contribution, we have made our code accessible at&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07208</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07208</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07208.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>A real-time Artificial Intelligence system for learning Sign Language</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07211.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07211&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07211&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;A primary challenge for the deaf and hearing-impaired community stems from the communication gap with the hearing society, which can greatly impact their daily lives and result in social exclusion. To foster inclusivity in society, our endeavor focuses on developing a cost-effective, resource-efficient, and open technology based on Artificial Intelligence, designed to assist people in learning and using Sign Language for communication. The analysis presented in this research paper intends to enrich the recent academic scientific literature on Sign Language solutions based on Artificial Intelligence, with a particular focus on American Sign Language (ASL). This research has yielded promising preliminary results and serves as a basis for further development.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07211</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07211</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07211.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Hybrid Training of Denoising Networks to Improve the Texture Acutance of Digital Cameras</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07212.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07212&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07212&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;In order to evaluate the capacity of a camera to render textures properly, the standard practice, used by classical scoring protocols, is to compute the frequential response to a dead leaves image target, from which is built a texture acutance metric. In this work, we propose a mixed training procedure for image restoration neural networks, relying on both natural and synthetic images, that yields a strong improvement of this acutance metric without impairing fidelity terms. The feasibility of the approach is demonstrated both on the denoising of RGB images and the full development of RAW images, opening the path to a systematic improvement of the texture acutance of real imaging devices.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07212</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07212</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07212.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Evolving Genetic Programming Tree Models for Predicting the Mechanical Properties of Green Fibers for Better Biocomposite Materials</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07213.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07213&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07213&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Advanced modern technology and industrial sustainability theme have contributed implementing composite materials for various industrial applications. Green composites are among the desired alternatives for the green products. However, to properly control the performance of the green composites, predicting their constituents properties are of paramount importance. This work presents an innovative evolving genetic programming tree models for predicting the mechanical properties of natural fibers based upon several inherent chemical and physical properties. Cellulose, hemicellulose, lignin and moisture contents as well as the Microfibrillar angle of various natural fibers were considered to establish the prediction models. A one-hold-out methodology was applied for training/testing phases. Robust models were developed to predict the tensile strength, Young&#39;s modulus, and the elongation at break properties of the natural fibers. It was revealed that Microfibrillar angle was dominant and capable of determining the ultimate tensile strength of the natural fibers by 44.7% comparable to other considered properties, while the impact of cellulose content in the model was only 35.6%. This in order would facilitate utilizing artificial intelligence in predicting the overall mechanical properties of natural fibers without experimental efforts and cost to enhance developing better green composite materials for various industrial applications.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07213</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07213</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07213.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07214.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07214&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07214&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The advent of Large Language Models (LLMs) has significantly reshaped the trajectory of the AI revolution. Nevertheless, these LLMs exhibit a notable limitation, as they are primarily adept at processing textual information. To address this constraint, researchers have endeavored to integrate visual capabilities with LLMs, resulting in the emergence of Vision-Language Models (VLMs). These advanced models are instrumental in tackling more intricate tasks such as image captioning and visual question answering. In our comprehensive survey paper, we delve into the key advancements within the realm of VLMs. Our classification organizes VLMs into three distinct categories: models dedicated to vision-language understanding, models that process multimodal inputs to generate unimodal (textual) outputs and models that both accept and produce multimodal inputs and outputs.This classification is based on their respective capabilities and functionalities in processing and generating various modalities of data.We meticulously dissect each model, offering an extensive analysis of its foundational architecture, training data sources, as well as its strengths and limitations wherever possible, providing readers with a comprehensive understanding of its essential components. We also analyzed the performance of VLMs in various benchmark datasets. By doing so, we aim to offer a nuanced understanding of the diverse landscape of VLMs. Additionally, we underscore potential avenues for future research in this dynamic domain, anticipating further breakthroughs and advancements.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07214</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07214</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07214.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Computation Offloading for Multi-server Multi-access Edge Vehicular Networks: A DDQN-based Method</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07215.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07215&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07215&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;In this paper, we investigate a multi-user offloading problem in the overlapping domain of a multi-server mobile edge computing system. We divide the original problem into two stages: the offloading decision making stage and the request scheduling stage. To prevent the terminal from going out of service area during offloading, we consider the mobility parameter of the terminal according to the human behaviour model when making the offloading decision, and then introduce a server evaluation mechanism based on both the mobility parameter and the server load to select the optimal offloading server. In order to fully utilise the server resources, we design a double deep Q-network (DDQN)-based reward evaluation algorithm that considers the priority of tasks when scheduling offload requests. Finally, numerical simulations are conducted to verify that our proposed method outperforms traditional mathematical computation methods as well as the DQN algorithm.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07215</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07215</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07215.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>A Bio-Medical Snake Optimizer System Driven by Logarithmic Surviving Global Search for Optimizing Feature Selection and its application for Disorder R...</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07216.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07216&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07216&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;It is of paramount importance to enhance medical practices, given how important it is to protect human life. Medical therapy can be accelerated by automating patient prediction using machine learning techniques. To double the efficiency of classifiers, several preprocessing strategies must be adopted for their crucial duty in this field. Feature selection (FS) is one tool that has been used frequently to modify data and enhance classification outcomes by lowering the dimensionality of datasets. Excluded features are those that have a poor correlation coefficient with the label class, that is, they have no meaningful correlation with classification and do not indicate where the instance belongs. Along with the recurring features, which show a strong association with the remainder of the features. Contrarily, the model being produced during training is harmed, and the classifier is misled by their presence. This causes overfitting and increases algorithm complexity and processing time. These are used in exploration to allow solutions to be found more thoroughly and in relation to a chosen solution than at random. TLSO, PLSO, and LLSO stand for Tournament Logarithmic Snake Optimizer, Proportional Logarithmic Snake Optimizer, and Linear Order Logarithmic Snake Optimizer, respectively. A number of 22 reference medical datasets were used in experiments. The findings indicate that, among 86 % of the datasets, TLSO attained the best accuracy, and among 82 % of the datasets, the best feature reduction. In terms of the standard deviation, the TLSO also attained noteworthy reliability and stability. On the basis of running duration, it is, nonetheless, quite effective.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07216</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07216</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07216.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Attention-aware Semantic Communications for Collaborative Inference</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07217.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07217&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07217&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;We propose a communication-efficient collaborative inference framework in the domain of edge inference, focusing on the efficient use of vision transformer (ViTs) models. The partitioning strategy of conventional collaborative inference fails to reduce communication cost because of the inherent architecture of ViTs maintaining consistent layer dimensions across the entire transformer encoder. Therefore, instead of employing the partitioning strategy, our framework utilizes a lightweight ViT model on the edge device, with the server deploying a complicated ViT model. To enhance communication efficiency and achieve the classification accuracy of the server model, we propose two strategies: 1) attention-aware patch selection and 2) entropy-aware image transmission. Attention-aware patch selection leverages the attention scores generated by the edge device&#39;s transformer encoder to identify and select the image patches critical for classification. This strategy enables the edge device to transmit only the essential patches to the server, significantly improving communication efficiency. Entropy-aware image transmission uses min-entropy as a metric to accurately determine whether to depend on the lightweight model on the edge device or to request the inference from the server model. In our framework, the lightweight ViT model on the edge device acts as a semantic encoder, efficiently identifying and selecting the crucial image information required for the classification task. Our experiments demonstrate that the proposed collaborative inference framework can reduce communication overhead by 68% with only a minimal loss in accuracy compared to the server model.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07217</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07217</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07217.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07220.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07220&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07220&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Retrieval-Augmented Generation (RAG) is a prevalent approach to infuse a private knowledge base of documents with Large Language Models (LLM) to build Generative Q\&amp;amp;A (Question-Answering) systems. However, RAG accuracy becomes increasingly challenging as the corpus of documents scales up, with Retrievers playing an outsized role in the overall RAG accuracy by extracting the most relevant document from the corpus to provide context to the LLM. In this paper, we propose the &#39;Blended RAG&#39; method of leveraging semantic search techniques, such as Dense Vector indexes and Sparse Encoder indexes, blended with hybrid query strategies. Our study achieves better retrieval results and sets new benchmarks for IR (Information Retrieval) datasets like NQ and TREC-COVID datasets. We further extend such a &#39;Blended Retriever&#39; to the RAG system to demonstrate far superior results on Generative Q\&amp;amp;A datasets like SQUAD, even surpassing fine-tuning performance.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07220</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07220</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07220.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Stock Recommendations for Individual Investors: A Temporal Graph Network Approach with Diversification-Enhancing Contrastive Learning</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07223.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07223&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07223&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;In complex financial markets, recommender systems can play a crucial role in empowering individuals to make informed decisions. Existing studies predominantly focus on price prediction, but even the most sophisticated models cannot accurately predict stock prices. Also, many studies show that most individual investors do not follow established investment theories because they have their own preferences. Hence, the tricky point in stock recommendation is that recommendations should give good investment performance but also should not ignore individual preferences. To develop effective stock recommender systems, it is essential to consider three key aspects: 1) individual preferences, 2) portfolio diversification, and 3) temporal aspect of both stock features and individual preferences. In response, we develop the portfolio temporal graph network recommender PfoTGNRec, which can handle time-varying collaborative signals and incorporates diversification-enhancing contrastive learning. As a result, our model demonstrated superior performance compared to various baselines, including cutting-edge dynamic embedding models and existing stock recommendation models, in a sense that our model exhibited good investment performance while maintaining competitive in capturing individual preferences. The source code and data are available at https://anonymous.4open.science/r/IJCAI2024-12F4.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07223</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07223</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07223.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Unveiling the Impact of Macroeconomic Policies: A Double Machine Learning Approach to Analyzing Interest Rate Effects on Financial Markets</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07225.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07225&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07225&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This study examines the effects of macroeconomic policies on financial markets using a novel approach that combines Machine Learning (ML) techniques and causal inference. It focuses on the effect of interest rate changes made by the US Federal Reserve System (FRS) on the returns of fixed income and equity funds between January 1986 and December 2021. The analysis makes a distinction between actively and passively managed funds, hypothesizing that the latter are less susceptible to changes in interest rates. The study contrasts gradient boosting and linear regression models using the Double Machine Learning (DML) framework, which supports a variety of statistical learning techniques. Results indicate that gradient boosting is a useful tool for predicting fund returns; for example, a 1% increase in interest rates causes an actively managed fund&#39;s return to decrease by -11.97%. This understanding of the relationship between interest rates and fund performance provides opportunities for additional research and insightful, data-driven advice for fund managers and investors&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07225</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07225</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07225.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Personality-affected Emotion Generation in Dialog Systems</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07229.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07229&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07229&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Generating appropriate emotions for responses is essential for dialog systems to provide human-like interaction in various application scenarios. Most previous dialog systems tried to achieve this goal by learning empathetic manners from anonymous conversational data. However, emotional responses generated by those methods may be inconsistent, which will decrease user engagement and service quality. Psychological findings suggest that the emotional expressions of humans are rooted in personality traits. Therefore, we propose a new task, Personality-affected Emotion Generation, to generate emotion based on the personality given to the dialog system and further investigate a solution through the personality-affected mood transition. Specifically, we first construct a daily dialog dataset, Personality EmotionLines Dataset (PELD), with emotion and personality annotations. Subsequently, we analyze the challenges in this task, i.e., (1) heterogeneously integrating personality and emotional factors and (2) extracting multi-granularity emotional information in the dialog context. Finally, we propose to model the personality as the transition weight by simulating the mood transition process in the dialog system and solve the challenges above. We conduct extensive experiments on PELD for evaluation. Results suggest that by adopting our method, the emotion generation performance is improved by 13% in macro-F1 and 5% in weighted-F1 from the BERT-base model.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07229</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07229</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07229.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Interval-valued fuzzy soft $β$-covering approximation spaces</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07230.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07230&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07230&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The concept of interval-valued fuzzy soft $\beta$-covering approximation spaces (IFS$\beta$CASs) is introduced to combine the theories of soft sets, rough sets and interval-valued fuzzy sets, and some fundamental propositions concerning interval-valued fuzzy soft $\beta$-neighborhoods and soft $\beta$-neighborhoods of IFS$\beta$CASs are explored. And then four kinds of interval-valued fuzzy soft $\beta$-coverings based fuzzy rough sets are researched. Finally, the relationships of four kinds of interval-valued fuzzy soft $\beta$-coverings based fuzzy rough sets are investigated.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07230</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07230</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07230.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Goal-guided Generative Prompt Injection Attack on Large Language Models</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07234.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07234&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07234&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Current large language models (LLMs) provide a strong foundation for large-scale user-oriented natural language tasks. A large number of users can easily inject adversarial text or instructions through the user interface, thus causing LLMs model security challenges. Although there is currently a large amount of research on prompt injection attacks, most of these black-box attacks use heuristic strategies. It is unclear how these heuristic strategies relate to the success rate of attacks and thus effectively improve model robustness. To solve this problem, we redefine the goal of the attack: to maximize the KL divergence between the conditional probabilities of the clean text and the adversarial text. Furthermore, we prove that maximizing the KL divergence is equivalent to maximizing the Mahalanobis distance between the embedded representation $x$ and $x&#39;$ of the clean text and the adversarial text when the conditional probability is a Gaussian distribution and gives a quantitative relationship on $x$ and $x&#39;$. Then we designed a simple and effective goal-guided generative prompt injection strategy (G2PIA) to find an injection text that satisfies specific constraints to achieve the optimal attack effect approximately. It is particularly noteworthy that our attack method is a query-free black-box attack method with low computational cost. Experimental results on seven LLM models and four datasets show the effectiveness of our attack method.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07234</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07234</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07234.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Explaining EDA synthesis errors with LLMs</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07235.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07235&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07235&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Training new engineers in digital design is a challenge, particularly when it comes to teaching the complex electronic design automation (EDA) tooling used in this domain. Learners will typically deploy designs in the Verilog and VHDL hardware description languages to Field Programmable Gate Arrays (FPGAs) from Altera (Intel) and Xilinx (AMD) via proprietary closed-source toolchains (Quartus Prime and Vivado, respectively). These tools are complex and difficult to use -- yet, as they are the tools used in industry, they are an essential first step in this space. In this work, we examine how recent advances in artificial intelligence may be leveraged to address aspects of this challenge. Specifically, we investigate if Large Language Models (LLMs), which have demonstrated text comprehension and question-answering capabilities, can be used to generate novice-friendly explanations of compile-time synthesis error messages from Quartus Prime and Vivado. To perform this study we generate 936 error message explanations using three OpenAI LLMs over 21 different buggy code samples. These are then graded for relevance and correctness, and we find that in approximately 71% of cases the LLMs give correct &amp;amp; complete explanations suitable for novice learners.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07235</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07235</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07235.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Advancements in Radiomics and Artificial Intelligence for Thyroid Cancer Diagnosis</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07239.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07239&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07239&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Thyroid cancer is an increasing global health concern that requires advanced diagnostic methods. The application of AI and radiomics to thyroid cancer diagnosis is examined in this review. A review of multiple databases was conducted in compliance with PRISMA guidelines until October 2023. A combination of keywords led to the discovery of an English academic publication on thyroid cancer and related subjects. 267 papers were returned from the original search after 109 duplicates were removed. Relevant studies were selected according to predetermined criteria after 124 articles were eliminated based on an examination of their abstract and title. After the comprehensive analysis, an additional six studies were excluded. Among the 28 included studies, radiomics analysis, which incorporates ultrasound (US) images, demonstrated its effectiveness in diagnosing thyroid cancer. Various results were noted, some of the studies presenting new strategies that outperformed the status quo. The literature has emphasized various challenges faced by AI models, including interpretability issues, dataset constraints, and operator dependence. The synthesized findings of the 28 included studies mentioned the need for standardization efforts and prospective multicenter studies to address these concerns. Furthermore, approaches to overcome these obstacles were identified, such as advances in explainable AI technology and personalized medicine techniques. The review focuses on how AI and radiomics could transform the diagnosis and treatment of thyroid cancer. Despite challenges, future research on multidisciplinary cooperation, clinical applicability validation, and algorithm improvement holds the potential to improve patient outcomes and diagnostic precision in the treatment of thyroid cancer.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07239</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07239</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07239.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07242.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07242&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07242&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Large Language Models (LLMs) are increasingly being developed and applied, but their widespread use faces challenges. These include aligning LLMs&#39; responses with human values to prevent harmful outputs, which is addressed through safety training methods. Even so, bad actors and malicious users have succeeded in attempts to manipulate the LLMs to generate misaligned responses for harmful questions such as methods to create a bomb in school labs, recipes for harmful drugs, and ways to evade privacy rights. Another challenge is the multilingual capabilities of LLMs, which enable the model to understand and respond in multiple languages. Consequently, attackers exploit the unbalanced pre-training datasets of LLMs in different languages and the comparatively lower model performance in low-resource languages than high-resource ones. As a result, attackers use a low-resource languages to intentionally manipulate the model to create harmful responses. Many of the similar attack vectors have been patched by model providers, making the LLMs more robust against language-based manipulation. In this paper, we introduce a new black-box attack vector called the \emph{Sandwich attack}: a multi-language mixture attack, which manipulates state-of-the-art LLMs into generating harmful and misaligned responses. Our experiments with five different models, namely Google&#39;s Bard, Gemini Pro, LLaMA-2-70-B-Chat, GPT-3.5-Turbo, GPT-4, and Claude-3-OPUS, show that this attack vector can be used by adversaries to generate harmful responses and elicit misaligned responses from these models. 
By detailing both the mechanism and impact of the Sandwich attack, this paper aims to guide future research and development towards more secure and resilient LLMs, ensuring they serve the public good while minimizing potential for misuse.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07242</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07242</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07242.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Generative Resident Separation and Multi-label Classification for Multi-person Activity Recognition</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07245.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07245&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07245&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This paper presents two models to address the problem of multi-person activity recognition using ambient sensors in a home. The first model, Seq2Res, uses a sequence generation approach to separate sensor events from different residents. The second model, BiGRU+Q2L, uses a Query2Label multi-label classifier to predict multiple activities simultaneously. Performances of these models are compared to a state-of-the-art model in different experimental scenarios, using a state-of-the-art dataset of two residents in a home instrumented with ambient sensors. These results lead to a discussion on the advantages and drawbacks of resident separation and multi-label classification for multi-person activity recognition.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07245</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07245</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07245.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>AI-Guided Defect Detection Techniques to Model Single Crystal Diamond Growth</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07306.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07306&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07306&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;From a process development perspective, diamond growth via chemical vapor deposition has made significant strides. However, challenges persist in achieving high quality and large-area material production. These difficulties include controlling conditions to maintain uniform growth rates for the entire growth surface. As growth progresses, various factors or defect states emerge, altering the uniform conditions. These changes affect the growth rate and result in the formation of crystalline defects at the microscale. However, there is a distinct lack of methods to identify these defect states and their geometry using images taken during the growth process. This paper details seminal work on defect segmentation pipeline using in-situ optical images to identify features that indicate defective states that are visible at the macroscale. Using a semantic segmentation approach as applied in our previous work, these defect states and corresponding derivative features are isolated and classified by their pixel masks. Using an annotation focused human-in-the-loop software architecture to produce training datasets, with modules for selective data labeling using active learning, data augmentations, and model-assisted labeling, our approach achieves effective annotation accuracy and drastically reduces the time and cost of labeling by orders of magnitude. On the model development front, we found that deep learning-based algorithms are the most efficient. They can accurately learn complex representations from feature-rich datasets. Our best-performing model, based on the YOLOV3 and DeeplabV3plus architectures, achieved excellent accuracy for specific features of interest. 
Specifically, it reached 93.35% accuracy for center defects, 92.83% for polycrystalline defects, and 91.98% for edge defects.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07306</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07306</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07306.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Structured Reinforcement Learning for Media Streaming at the Wireless Edge</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07315.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07315&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07315&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Media streaming is the dominant application over wireless edge (access) networks. The increasing softwarization of such networks has led to efforts at intelligent control, wherein application-specific actions may be dynamically taken to enhance the user experience. The goal of this work is to develop and demonstrate learning-based policies for optimal decision making to determine which clients to dynamically prioritize in a video streaming setting. We formulate the policy design question as a constrained Markov decision problem (CMDP), and observe that by using a Lagrangian relaxation we can decompose it into single-client problems. Further, the optimal policy takes a threshold form in the video buffer length, which enables us to design an efficient constrained reinforcement learning (CRL) algorithm to learn it. Specifically, we show that a natural policy gradient (NPG) based algorithm that is derived using the structure of our problem converges to the globally optimal policy. We then develop a simulation environment for training, and a real-world intelligent controller attached to a WiFi access point for evaluation. We empirically show that the structured learning approach enables fast learning. Furthermore, such a structured policy can be easily deployed due to low computational complexity, leading to policy execution taking only about 15$\mu$s. Using YouTube streaming experiments in a resource constrained scenario, we demonstrate that the CRL approach can increase QoE by over 30%.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07315</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07315</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07315.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Interactive Learning of Physical Object Properties Through Robot Manipulation and Database of Object Measurements</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07344.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07344&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07344&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This work presents a framework for automatically extracting physical object properties, such as material composition, mass, volume, and stiffness, through robot manipulation and a database of object measurements. The framework involves exploratory action selection to maximize learning about objects on a table. A Bayesian network models conditional dependencies between object properties, incorporating prior probability distributions and uncertainty associated with measurement actions. The algorithm selects optimal exploratory actions based on expected information gain and updates object properties through Bayesian inference. Experimental evaluation demonstrates effective action selection compared to a baseline and correct termination of the experiments if there is nothing more to be learned. The algorithm proved to behave intelligently when presented with trick objects with material properties in conflict with their appearance. The robot pipeline integrates with a logging module and an online database of objects, containing over 24,000 measurements of 63 objects with different grippers. All code and data are publicly available, facilitating automatic digitization of objects and their physical properties through exploratory manipulations.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07344</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07344</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07344.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Addressing the Abstraction and Reasoning Corpus via Procedural Example Generation</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07353.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07353&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07353&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This work presents code to procedurally generate examples for the ARC training tasks. For each of the 400 tasks, an example generator following the transformation logic of the original examples was created. In effect, the assumed underlying distribution of examples for any given task was reverse engineered by implementing a means to sample from it. An attempt was made to cover an as large as reasonable space of possible examples for each task. That is, whenever the original examples of a given task may be limited in their diversity e.g. by having the dimensions of the grids, the set of symbols or number of objects constant or within tight bounds, even though the transformation does not require it, such constraints were lifted. Having access to not just a few examples per task, as the case for ARC, but instead very many, should enable a wide range of experiments that may be important stepping stones towards making leaps on the benchmark.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07353</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07353</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07353.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>GANsemble for Small and Imbalanced Data Sets: A Baseline for Synthetic Microplastics Data</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07356.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07356&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07356&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Microplastic particle ingestion or inhalation by humans is a problem of growing concern. Unfortunately, current research methods that use machine learning to understand their potential harms are obstructed by a lack of available data. Deep learning techniques in particular are challenged by such domains where only small or imbalanced data sets are available. Overcoming this challenge often involves oversampling underrepresented classes or augmenting the existing data to improve model performance. This paper proposes GANsemble: a two-module framework connecting data augmentation with conditional generative adversarial networks (cGANs) to generate class-conditioned synthetic data. First, the data chooser module automates augmentation strategy selection by searching for the best data augmentation strategy. Next, the cGAN module uses this strategy to train a cGAN for generating enhanced synthetic data. We experiment with the GANsemble framework on a small and imbalanced microplastics data set. A Microplastic-cGAN (MPcGAN) algorithm is introduced, and baselines for synthetic microplastics (SYMP) data are established in terms of Frechet Inception Distance (FID) and Inception Scores (IS). We also provide a synthetic microplastics filter (SYMP-Filter) algorithm to increase the quality of generated SYMP. Additionally, we show the best amount of oversampling with augmentation to fix class imbalance in small microplastics data sets. To our knowledge, this study is the first application of generative AI to synthetically create microplastics data.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07356</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07356</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07356.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Differentially Private GANs for Generating Synthetic Indoor Location Data</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07366.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07366&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07366&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The advent of location-based services has led to the widespread adoption of indoor localization systems, which enable location tracking of individuals within enclosed spaces such as buildings. While these systems provide numerous benefits such as improved security and personalized services, they also raise concerns regarding privacy violations. As such, there is a growing need for privacy-preserving solutions that can protect users&#39; sensitive location information while still enabling the functionality of indoor localization systems. In recent years, Differentially Private Generative Adversarial Networks (DPGANs) have emerged as a powerful methodology that aims to protect the privacy of individual data points while generating realistic synthetic data similar to original data. DPGANs combine the power of generative adversarial networks (GANs) with the privacy-preserving technique of differential privacy (DP). In this paper, we introduce an indoor localization framework employing DPGANs in order to generate privacy-preserving indoor location data. We evaluate the performance of our framework on a real-world indoor localization dataset and demonstrate its effectiveness in preserving privacy while maintaining the accuracy of the localization system.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07366</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07366</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07366.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Deep Generative Sampling in the Dual Divergence Space: A Data-efficient &amp; Interpretative Approach for Generative AI</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07377.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07377&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07377&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Building on the remarkable achievements in generative sampling of natural images, we propose an innovative challenge, potentially overly ambitious, which involves generating samples of entire multivariate time series that resemble images. However, the statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects. This issue is especially problematic for deep generative models that follow the conventional approach of generating samples from a canonical distribution and then decoding or denoising them to match the true data distribution. In contrast, our method is grounded in information theory and aims to implicitly characterize the distribution of images, particularly the (global and local) dependency structure between pixels. We achieve this by empirically estimating its KL-divergence in the dual form with respect to the respective marginal distribution. This enables us to perform generative sampling directly in the optimized 1-D dual divergence space. Specifically, in the dual space, training samples representing the data distribution are embedded in the form of various clusters between two end points. In theory, any sample embedded between those two end points is in-distribution w.r.t. the data distribution. Our key idea for generating novel samples of images is to interpolate between the clusters via a walk as per gradients of the dual function w.r.t. the data dimensions. In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution. 
We provide strong theoretical guarantees along with an extensive empirical evaluation using many real-world datasets from diverse domains, establishing the superiority of our approach w.r.t. state-of-the-art deep learning methods.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07377</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07377</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07377.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07383.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07383&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07383&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Autonomous vehicles often make complex decisions via machine learning-based predictive models applied to collected sensor data. While this combination of methods provides a foundation for real-time actions, self-driving behavior primarily remains opaque to end users. In this sense, explainability of real-time decisions is a crucial and natural requirement for building trust in autonomous vehicles. Moreover, as autonomous vehicles still cause serious traffic accidents for various reasons, timely conveyance of upcoming hazards to road users can help improve scene understanding and prevent potential risks. Hence, there is also a need to supply autonomous vehicles with user-friendly interfaces for effective human-machine teaming. Motivated by this problem, we study the role of explainable AI and human-machine interface jointly in building trust in vehicle autonomy. We first present a broad context of the explanatory human-machine systems with the &quot;3W1H&quot; (what, whom, when, how) approach. Based on these findings, we present a situation awareness framework for calibrating users&#39; trust in self-driving behavior. Finally, we perform an experiment on our framework, conduct a user study on it, and validate the empirical findings with hypothesis testing.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07383</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07383</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07383.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>BISCUIT: Scaffolding LLM-Generated Code with Ephemeral UIs in Computational Notebooks</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07387.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07387&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07387&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Novices frequently engage with machine learning tutorials in computational notebooks and have been adopting code generation technologies based on large language models (LLMs). However, they encounter difficulties in understanding and working with code produced by LLMs. To mitigate these challenges, we introduce a novel workflow into computational notebooks that augments LLM-based code generation with an additional ephemeral UI step, offering users UI-based scaffolds as an intermediate stage between user prompts and code generation. We present this workflow in BISCUIT, an extension for JupyterLab that provides users with ephemeral UIs generated by LLMs based on the context of their code and intentions, scaffolding users to understand, guide, and explore with LLM-generated code. Through 10 user studies where novices used BISCUIT for machine learning tutorials, we discover that BISCUIT offers user semantic representation of code to aid their understanding, reduces the complexity of prompt engineering, and creates a playground for users to explore different variables and iterate on their ideas. We discuss the implications of our findings for UI-centric interactive paradigm in code generation LLMs.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07387</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07387</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07387.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>ChatGPT Can Predict the Future when it Tells Stories Set in the Future About the Past</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07396.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07396&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07396&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This study investigates whether OpenAI&#39;s ChatGPT-3.5 and ChatGPT-4 can accurately forecast future events using two distinct prompting strategies. To evaluate the accuracy of the predictions, we take advantage of the fact that the training data at the time of experiment stopped at September 2021, and ask about events that happened in 2022 using ChatGPT-3.5 and ChatGPT-4. We employed two prompting strategies: direct prediction and what we call future narratives which ask ChatGPT to tell fictional stories set in the future with characters that share events that have happened to them, but after ChatGPT&#39;s training data had been collected. Concentrating on events in 2022, we prompted ChatGPT to engage in storytelling, particularly within economic contexts. After analyzing 100 prompts, we discovered that future narrative prompts significantly enhanced ChatGPT-4&#39;s forecasting accuracy. This was especially evident in its predictions of major Academy Award winners as well as economic trends, the latter inferred from scenarios where the model impersonated public figures like the Federal Reserve Chair, Jerome Powell. These findings indicate that narrative prompts leverage the models&#39; capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs&#39; predictive capabilities and suggests potential future applications in analytical contexts.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07396</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07396</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07396.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>JetMoE: Reaching Llama2 Performance with 0.1M Dollars</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07413.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07413&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07413&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Large Language Models (LLMs) have achieved remarkable results, but their increasing resource demand has become a major obstacle to the development of powerful and accessible super-human intelligence. This report introduces JetMoE-8B, a new LLM trained with less than $0.1 million, using 1.25T tokens from carefully mixed open-source corpora and 30,000 H100 GPU hours. Despite its low cost, the JetMoE-8B demonstrates impressive performance, with JetMoE-8B outperforming the Llama2-7B model and JetMoE-8B-Chat surpassing the Llama2-13B-Chat model. These results suggest that LLM training can be much more cost-effective than generally thought. JetMoE-8B is based on an efficient Sparsely-gated Mixture-of-Experts (SMoE) architecture, composed of attention and feedforward experts. Both layers are sparsely activated, allowing JetMoE-8B to have 8B parameters while only activating 2B for each input token, reducing inference computation by about 70% compared to Llama2-7B. Moreover, JetMoE-8B is highly open and academia-friendly, using only public datasets and training code. All training parameters and data mixtures have been detailed in this report to facilitate future efforts in the development of open foundation models. This transparency aims to encourage collaboration and further advancements in the field of accessible and efficient LLMs. The model weights are publicly available at https://github.com/myshell-ai/JetMoE.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07413</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07413</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07413.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Data-Driven Portfolio Management for Motion Pictures Industry: A New Data-Driven Optimization Methodology Using a Large Language Model as the Expert</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07434.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07434&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07434&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Portfolio management is one of the unresponded problems of the Motion Pictures Industry (MPI). To design an optimal portfolio for an MPI distributor, it is essential to predict the box office of each project. Moreover, for an accurate box office prediction, it is critical to consider the effect of the celebrities involved in each MPI project, which was impossible with any precedent expert-based method. Additionally, the asymmetric characteristic of MPI data decreases the performance of any predictive algorithm. In this paper, firstly, the fame score of the celebrities is determined using a large language model. Then, to tackle the asymmetric character of MPI&#39;s data, projects are classified. Furthermore, the box office prediction takes place for each class of projects. Finally, using a hybrid multi-attribute decision-making technique, the preferability of each project for the distributor is calculated, and benefiting from a bi-objective optimization model, the optimal portfolio is designed.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07434</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07434</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07434.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Graph Attention Network for Lane-Wise and Topology-Invariant Intersection Traffic Simulation</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07446.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07446&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07446&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Traffic congestion has significant economic, environmental, and social ramifications. Intersection traffic flow dynamics are influenced by numerous factors. While microscopic traffic simulators are valuable tools, they are computationally intensive and challenging to calibrate. Moreover, existing machine-learning approaches struggle to provide lane-specific waveforms or adapt to intersection topology and traffic patterns. In this study, we propose two efficient and accurate &quot;Digital Twin&quot; models for intersections, leveraging Graph Attention Neural Networks (GAT). These attentional graph auto-encoder digital twins capture temporal, spatial, and contextual aspects of traffic within intersections, incorporating various influential factors such as high-resolution loop detector waveforms, signal state records, driving behaviors, and turning-movement counts. Trained on diverse counterfactual scenarios across multiple intersections, our models generalize well, enabling the estimation of detailed traffic waveforms for any intersection approach and exit lanes. Multi-scale error metrics demonstrate that our models perform comparably to microsimulations. The primary application of our study lies in traffic signal optimization, a pivotal area in transportation systems research. These lightweight digital twins can seamlessly integrate into corridor and network signal timing optimization frameworks. Furthermore, our study&#39;s applications extend to lane reconfiguration, driving behavior analysis, and facilitating informed decisions regarding intersection safety and efficiency enhancements. 
A promising avenue for future research involves extending this approach to urban freeway corridors and integrating it with measures of effectiveness metrics.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07446</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07446</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07446.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>RiskLabs: Predicting Financial Risk Using Large Language Model Based on Multi-Sources Data</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07452.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07452&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07452&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The integration of Artificial Intelligence (AI) techniques, particularly large language models (LLMs), in finance has garnered increasing academic attention. Despite progress, existing studies predominantly focus on tasks like financial text summarization, question-answering (Q$\&amp;amp;$A), and stock movement prediction (binary classification), with a notable gap in the application of LLMs for financial risk prediction. Addressing this gap, in this paper, we introduce \textbf{RiskLabs}, a novel framework that leverages LLMs to analyze and predict financial risks. RiskLabs uniquely combines different types of financial data, including textual and vocal information from Earnings Conference Calls (ECCs), market-related time series data, and contextual news data surrounding ECC release dates. Our approach involves a multi-stage p

Comment on lines 110 to 125
target: '/papers/arxiv/cs.AI',
},
{
title: 'arXiv Computation and Language (cs.CL)',
source: ['papers.cool/arxiv/cs.CL'],
target: '/papers/arxiv/cs.CL',
},
{
title: 'arXiv Computer Vision and Pattern Recognition (cs.CV)',
source: ['papers.cool/arxiv/cs.CV'],
target: '/papers/arxiv/cs.CV',
},
{
title: 'arXiv Machine Learning (cs.LG)',
source: ['papers.cool/arxiv/cs.LG'],
target: '/papers/arxiv/cs.LG',
Collaborator

Suggested change
-target: '/papers/arxiv/cs.AI',
-},
-{
-title: 'arXiv Computation and Language (cs.CL)',
-source: ['papers.cool/arxiv/cs.CL'],
-target: '/papers/arxiv/cs.CL',
-},
-{
-title: 'arXiv Computer Vision and Pattern Recognition (cs.CV)',
-source: ['papers.cool/arxiv/cs.CV'],
-target: '/papers/arxiv/cs.CV',
-},
-{
-title: 'arXiv Machine Learning (cs.LG)',
-source: ['papers.cool/arxiv/cs.LG'],
-target: '/papers/arxiv/cs.LG',
+target: '/arxiv/cs.AI',
+},
+{
+title: 'arXiv Computation and Language (cs.CL)',
+source: ['papers.cool/arxiv/cs.CL'],
+target: '/arxiv/cs.CL',
+},
+{
+title: 'arXiv Computer Vision and Pattern Recognition (cs.CV)',
+source: ['papers.cool/arxiv/cs.CV'],
+target: '/arxiv/cs.CV',
+},
+{
+title: 'arXiv Machine Learning (cs.LG)',
+source: ['papers.cool/arxiv/cs.LG'],
+target: '/arxiv/cs.LG',
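The suggested change drops the `/papers` prefix because radar `target` paths in a namespace's rules are resolved relative to the namespace itself, so `/arxiv/cs.AI` already maps to the full route `/papers/arxiv/cs.AI`. A minimal sketch of that resolution, assuming an illustrative `NAMESPACE` constant and `fullPath` helper (these names are not RSSHub exports):

```typescript
// Hypothetical sketch: how a namespace-relative radar target resolves
// to the full route path. `RadarRule`, `NAMESPACE`, and `fullPath`
// are illustrative names for this example only.
interface RadarRule {
  title: string;
  source: string[];
  target: string;
}

// The namespace this route lives under.
const NAMESPACE = '/papers';

const rules: RadarRule[] = [
  {
    title: 'arXiv Artificial Intelligence (cs.AI)',
    source: ['papers.cool/arxiv/cs.AI'],
    // Relative to the namespace, as in the suggested change.
    target: '/arxiv/cs.AI',
  },
];

// Join the namespace prefix with the relative target.
const fullPath = (rule: RadarRule): string => `${NAMESPACE}${rule.target}`;

console.log(fullPath(rules[0])); // "/papers/arxiv/cs.AI"
```

With the original `/papers/arxiv/cs.AI` targets, the prefix would have been applied twice, which is why the collaborator's suggestion strips it.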

Contributor Author

😅

Contributor

Successfully generated as following:

http://localhost:1200/papers/arxiv/cs.AI - Success ✔️
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>Artificial Intelligence</title>
    <link>https://papers.cool/arxiv/cs.AI</link>
    <atom:link href="http://localhost:1200/papers/arxiv/cs.AI" rel="self" type="application/rss+xml"></atom:link>
    <description>Artificial Intelligence - Made with love by RSSHub(https://github.com/DIYgod/RSSHub)</description>
    <generator>RSSHub</generator>
    <webMaster>i@diygod.me (DIYgod)</webMaster>
    <language>en</language>
    <lastBuildDate>Sun, 14 Apr 2024 16:02:11 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>Uncertainty-guided annotation enhances segmentation with the human-in-the-loop</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07208.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07208&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07208&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Deep learning algorithms, often critiqued for their &#39;black box&#39; nature, traditionally fall short in providing the necessary transparency for trusted clinical use. This challenge is particularly evident when such models are deployed in local hospitals, encountering out-of-domain distributions due to varying imaging techniques and patient-specific pathologies. Yet, this limitation offers a unique avenue for continual learning. The Uncertainty-Guided Annotation (UGA) framework introduces a human-in-the-loop approach, enabling AI to convey its uncertainties to clinicians, effectively acting as an automated quality control mechanism. UGA eases this interaction by quantifying uncertainty at the pixel level, thereby revealing the model&#39;s limitations and opening the door for clinician-guided corrections. We evaluated UGA on the Camelyon dataset for lymph node metastasis segmentation which revealed that UGA improved the Dice coefficient (DC), from 0.66 to 0.76 by adding 5 patches, and further to 0.84 with 10 patches. To foster broader application and community contribution, we have made our code accessible at&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07208</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07208</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07208.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>A real-time Artificial Intelligence system for learning Sign Language</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07211.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07211&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07211&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;A primary challenge for the deaf and hearing-impaired community stems from the communication gap with the hearing society, which can greatly impact their daily lives and result in social exclusion. To foster inclusivity in society, our endeavor focuses on developing a cost-effective, resource-efficient, and open technology based on Artificial Intelligence, designed to assist people in learning and using Sign Language for communication. The analysis presented in this research paper intends to enrich the recent academic scientific literature on Sign Language solutions based on Artificial Intelligence, with a particular focus on American Sign Language (ASL). This research has yielded promising preliminary results and serves as a basis for further development.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07211</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07211</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07211.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Hybrid Training of Denoising Networks to Improve the Texture Acutance of Digital Cameras</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07212.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07212&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07212&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;In order to evaluate the capacity of a camera to render textures properly, the standard practice, used by classical scoring protocols, is to compute the frequential response to a dead leaves image target, from which is built a texture acutance metric. In this work, we propose a mixed training procedure for image restoration neural networks, relying on both natural and synthetic images, that yields a strong improvement of this acutance metric without impairing fidelity terms. The feasibility of the approach is demonstrated both on the denoising of RGB images and the full development of RAW images, opening the path to a systematic improvement of the texture acutance of real imaging devices.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07212</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07212</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07212.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Evolving Genetic Programming Tree Models for Predicting the Mechanical Properties of Green Fibers for Better Biocomposite Materials</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07213.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07213&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07213&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Advanced modern technology and industrial sustainability theme have contributed implementing composite materials for various industrial applications. Green composites are among the desired alternatives for the green products. However, to properly control the performance of the green composites, predicting their constituents properties are of paramount importance. This work presents an innovative evolving genetic programming tree models for predicting the mechanical properties of natural fibers based upon several inherent chemical and physical properties. Cellulose, hemicellulose, lignin and moisture contents as well as the Microfibrillar angle of various natural fibers were considered to establish the prediction models. A one-hold-out methodology was applied for training/testing phases. Robust models were developed to predict the tensile strength, Young&#39;s modulus, and the elongation at break properties of the natural fibers. It was revealed that Microfibrillar angle was dominant and capable of determining the ultimate tensile strength of the natural fibers by 44.7% comparable to other considered properties, while the impact of cellulose content in the model was only 35.6%. This in order would facilitate utilizing artificial intelligence in predicting the overall mechanical properties of natural fibers without experimental efforts and cost to enhance developing better green composite materials for various industrial applications.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07213</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07213</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07213.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07214.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07214&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07214&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The advent of Large Language Models (LLMs) has significantly reshaped the trajectory of the AI revolution. Nevertheless, these LLMs exhibit a notable limitation, as they are primarily adept at processing textual information. To address this constraint, researchers have endeavored to integrate visual capabilities with LLMs, resulting in the emergence of Vision-Language Models (VLMs). These advanced models are instrumental in tackling more intricate tasks such as image captioning and visual question answering. In our comprehensive survey paper, we delve into the key advancements within the realm of VLMs. Our classification organizes VLMs into three distinct categories: models dedicated to vision-language understanding, models that process multimodal inputs to generate unimodal (textual) outputs and models that both accept and produce multimodal inputs and outputs.This classification is based on their respective capabilities and functionalities in processing and generating various modalities of data.We meticulously dissect each model, offering an extensive analysis of its foundational architecture, training data sources, as well as its strengths and limitations wherever possible, providing readers with a comprehensive understanding of its essential components. We also analyzed the performance of VLMs in various benchmark datasets. By doing so, we aim to offer a nuanced understanding of the diverse landscape of VLMs. Additionally, we underscore potential avenues for future research in this dynamic domain, anticipating further breakthroughs and advancements.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07214</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07214</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07214.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Computation Offloading for Multi-server Multi-access Edge Vehicular Networks: A DDQN-based Method</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07215.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07215&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07215&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;In this paper, we investigate a multi-user offloading problem in the overlapping domain of a multi-server mobile edge computing system. We divide the original problem into two stages: the offloading decision making stage and the request scheduling stage. To prevent the terminal from going out of service area during offloading, we consider the mobility parameter of the terminal according to the human behaviour model when making the offloading decision, and then introduce a server evaluation mechanism based on both the mobility parameter and the server load to select the optimal offloading server. In order to fully utilise the server resources, we design a double deep Q-network (DDQN)-based reward evaluation algorithm that considers the priority of tasks when scheduling offload requests. Finally, numerical simulations are conducted to verify that our proposed method outperforms traditional mathematical computation methods as well as the DQN algorithm.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07215</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07215</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07215.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>A Bio-Medical Snake Optimizer System Driven by Logarithmic Surviving Global Search for Optimizing Feature Selection and its application for Disorder R...</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07216.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07216&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07216&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;It is of paramount importance to enhance medical practices, given how important it is to protect human life. Medical therapy can be accelerated by automating patient prediction using machine learning techniques. To double the efficiency of classifiers, several preprocessing strategies must be adopted for their crucial duty in this field. Feature selection (FS) is one tool that has been used frequently to modify data and enhance classification outcomes by lowering the dimensionality of datasets. Excluded features are those that have a poor correlation coefficient with the label class, that is, they have no meaningful correlation with classification and do not indicate where the instance belongs. Along with the recurring features, which show a strong association with the remainder of the features. Contrarily, the model being produced during training is harmed, and the classifier is misled by their presence. This causes overfitting and increases algorithm complexity and processing time. These are used in exploration to allow solutions to be found more thoroughly and in relation to a chosen solution than at random. TLSO, PLSO, and LLSO stand for Tournament Logarithmic Snake Optimizer, Proportional Logarithmic Snake Optimizer, and Linear Order Logarithmic Snake Optimizer, respectively. A number of 22 reference medical datasets were used in experiments. The findings indicate that, among 86 % of the datasets, TLSO attained the best accuracy, and among 82 % of the datasets, the best feature reduction. In terms of the standard deviation, the TLSO also attained noteworthy reliability and stability. On the basis of running duration, it is, nonetheless, quite effective.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07216</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07216</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07216.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Attention-aware Semantic Communications for Collaborative Inference</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07217.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07217&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07217&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;We propose a communication-efficient collaborative inference framework in the domain of edge inference, focusing on the efficient use of vision transformer (ViTs) models. The partitioning strategy of conventional collaborative inference fails to reduce communication cost because of the inherent architecture of ViTs maintaining consistent layer dimensions across the entire transformer encoder. Therefore, instead of employing the partitioning strategy, our framework utilizes a lightweight ViT model on the edge device, with the server deploying a complicated ViT model. To enhance communication efficiency and achieve the classification accuracy of the server model, we propose two strategies: 1) attention-aware patch selection and 2) entropy-aware image transmission. Attention-aware patch selection leverages the attention scores generated by the edge device&#39;s transformer encoder to identify and select the image patches critical for classification. This strategy enables the edge device to transmit only the essential patches to the server, significantly improving communication efficiency. Entropy-aware image transmission uses min-entropy as a metric to accurately determine whether to depend on the lightweight model on the edge device or to request the inference from the server model. In our framework, the lightweight ViT model on the edge device acts as a semantic encoder, efficiently identifying and selecting the crucial image information required for the classification task. Our experiments demonstrate that the proposed collaborative inference framework can reduce communication overhead by 68% with only a minimal loss in accuracy compared to the server model.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07217</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07217</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07217.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07220.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07220&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07220&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Retrieval-Augmented Generation (RAG) is a prevalent approach to infuse a private knowledge base of documents with Large Language Models (LLM) to build Generative Q\&amp;amp;A (Question-Answering) systems. However, RAG accuracy becomes increasingly challenging as the corpus of documents scales up, with Retrievers playing an outsized role in the overall RAG accuracy by extracting the most relevant document from the corpus to provide context to the LLM. In this paper, we propose the &#39;Blended RAG&#39; method of leveraging semantic search techniques, such as Dense Vector indexes and Sparse Encoder indexes, blended with hybrid query strategies. Our study achieves better retrieval results and sets new benchmarks for IR (Information Retrieval) datasets like NQ and TREC-COVID datasets. We further extend such a &#39;Blended Retriever&#39; to the RAG system to demonstrate far superior results on Generative Q\&amp;amp;A datasets like SQUAD, even surpassing fine-tuning performance.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07220</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07220</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07220.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Stock Recommendations for Individual Investors: A Temporal Graph Network Approach with Diversification-Enhancing Contrastive Learning</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07223.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07223&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07223&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;In complex financial markets, recommender systems can play a crucial role in empowering individuals to make informed decisions. Existing studies predominantly focus on price prediction, but even the most sophisticated models cannot accurately predict stock prices. Also, many studies show that most individual investors do not follow established investment theories because they have their own preferences. Hence, the tricky point in stock recommendation is that recommendations should give good investment performance but also should not ignore individual preferences. To develop effective stock recommender systems, it is essential to consider three key aspects: 1) individual preferences, 2) portfolio diversification, and 3) temporal aspect of both stock features and individual preferences. In response, we develop the portfolio temporal graph network recommender PfoTGNRec, which can handle time-varying collaborative signals and incorporates diversification-enhancing contrastive learning. As a result, our model demonstrated superior performance compared to various baselines, including cutting-edge dynamic embedding models and existing stock recommendation models, in a sense that our model exhibited good investment performance while maintaining competitive in capturing individual preferences. The source code and data are available at https://anonymous.4open.science/r/IJCAI2024-12F4.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07223</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07223</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07223.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Unveiling the Impact of Macroeconomic Policies: A Double Machine Learning Approach to Analyzing Interest Rate Effects on Financial Markets</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07225.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07225&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07225&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This study examines the effects of macroeconomic policies on financial markets using a novel approach that combines Machine Learning (ML) techniques and causal inference. It focuses on the effect of interest rate changes made by the US Federal Reserve System (FRS) on the returns of fixed income and equity funds between January 1986 and December 2021. The analysis makes a distinction between actively and passively managed funds, hypothesizing that the latter are less susceptible to changes in interest rates. The study contrasts gradient boosting and linear regression models using the Double Machine Learning (DML) framework, which supports a variety of statistical learning techniques. Results indicate that gradient boosting is a useful tool for predicting fund returns; for example, a 1% increase in interest rates causes an actively managed fund&#39;s return to decrease by -11.97%. This understanding of the relationship between interest rates and fund performance provides opportunities for additional research and insightful, data-driven advice for fund managers and investors&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07225</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07225</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07225.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Personality-affected Emotion Generation in Dialog Systems</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07229.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07229&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07229&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Generating appropriate emotions for responses is essential for dialog systems to provide human-like interaction in various application scenarios. Most previous dialog systems tried to achieve this goal by learning empathetic manners from anonymous conversational data. However, emotional responses generated by those methods may be inconsistent, which will decrease user engagement and service quality. Psychological findings suggest that the emotional expressions of humans are rooted in personality traits. Therefore, we propose a new task, Personality-affected Emotion Generation, to generate emotion based on the personality given to the dialog system and further investigate a solution through the personality-affected mood transition. Specifically, we first construct a daily dialog dataset, Personality EmotionLines Dataset (PELD), with emotion and personality annotations. Subsequently, we analyze the challenges in this task, i.e., (1) heterogeneously integrating personality and emotional factors and (2) extracting multi-granularity emotional information in the dialog context. Finally, we propose to model the personality as the transition weight by simulating the mood transition process in the dialog system and solve the challenges above. We conduct extensive experiments on PELD for evaluation. Results suggest that by adopting our method, the emotion generation performance is improved by 13% in macro-F1 and 5% in weighted-F1 from the BERT-base model.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07229</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07229</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07229.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Interval-valued fuzzy soft $β$-covering approximation spaces</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07230.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07230&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07230&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The concept of interval-valued fuzzy soft $\beta$-covering approximation spaces (IFS$\beta$CASs) is introduced to combine the theories of soft sets, rough sets and interval-valued fuzzy sets, and some fundamental propositions concerning interval-valued fuzzy soft $\beta$-neighborhoods and soft $\beta$-neighborhoods of IFS$\beta$CASs are explored. And then four kinds of interval-valued fuzzy soft $\beta$-coverings based fuzzy rough sets are researched. Finally, the relationships of four kinds of interval-valued fuzzy soft $\beta$-coverings based fuzzy rough sets are investigated.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07230</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07230</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07230.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Goal-guided Generative Prompt Injection Attack on Large Language Models</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07234.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07234&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07234&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Current large language models (LLMs) provide a strong foundation for large-scale user-oriented natural language tasks. A large number of users can easily inject adversarial text or instructions through the user interface, thus causing LLMs model security challenges. Although there is currently a large amount of research on prompt injection attacks, most of these black-box attacks use heuristic strategies. It is unclear how these heuristic strategies relate to the success rate of attacks and thus effectively improve model robustness. To solve this problem, we redefine the goal of the attack: to maximize the KL divergence between the conditional probabilities of the clean text and the adversarial text. Furthermore, we prove that maximizing the KL divergence is equivalent to maximizing the Mahalanobis distance between the embedded representation $x$ and $x&#39;$ of the clean text and the adversarial text when the conditional probability is a Gaussian distribution and gives a quantitative relationship on $x$ and $x&#39;$. Then we designed a simple and effective goal-guided generative prompt injection strategy (G2PIA) to find an injection text that satisfies specific constraints to achieve the optimal attack effect approximately. It is particularly noteworthy that our attack method is a query-free black-box attack method with low computational cost. Experimental results on seven LLM models and four datasets show the effectiveness of our attack method.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07234</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07234</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07234.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Explaining EDA synthesis errors with LLMs</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07235.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07235&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07235&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Training new engineers in digital design is a challenge, particularly when it comes to teaching the complex electronic design automation (EDA) tooling used in this domain. Learners will typically deploy designs in the Verilog and VHDL hardware description languages to Field Programmable Gate Arrays (FPGAs) from Altera (Intel) and Xilinx (AMD) via proprietary closed-source toolchains (Quartus Prime and Vivado, respectively). These tools are complex and difficult to use -- yet, as they are the tools used in industry, they are an essential first step in this space. In this work, we examine how recent advances in artificial intelligence may be leveraged to address aspects of this challenge. Specifically, we investigate if Large Language Models (LLMs), which have demonstrated text comprehension and question-answering capabilities, can be used to generate novice-friendly explanations of compile-time synthesis error messages from Quartus Prime and Vivado. To perform this study we generate 936 error message explanations using three OpenAI LLMs over 21 different buggy code samples. These are then graded for relevance and correctness, and we find that in approximately 71% of cases the LLMs give correct &amp;amp; complete explanations suitable for novice learners.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07235</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07235</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07235.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Advancements in Radiomics and Artificial Intelligence for Thyroid Cancer Diagnosis</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07239.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07239&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07239&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Thyroid cancer is an increasing global health concern that requires advanced diagnostic methods. The application of AI and radiomics to thyroid cancer diagnosis is examined in this review. A review of multiple databases was conducted in compliance with PRISMA guidelines until October 2023. A combination of keywords led to the discovery of an English academic publication on thyroid cancer and related subjects. 267 papers were returned from the original search after 109 duplicates were removed. Relevant studies were selected according to predetermined criteria after 124 articles were eliminated based on an examination of their abstract and title. After the comprehensive analysis, an additional six studies were excluded. Among the 28 included studies, radiomics analysis, which incorporates ultrasound (US) images, demonstrated its effectiveness in diagnosing thyroid cancer. Various results were noted, some of the studies presenting new strategies that outperformed the status quo. The literature has emphasized various challenges faced by AI models, including interpretability issues, dataset constraints, and operator dependence. The synthesized findings of the 28 included studies mentioned the need for standardization efforts and prospective multicenter studies to address these concerns. Furthermore, approaches to overcome these obstacles were identified, such as advances in explainable AI technology and personalized medicine techniques. The review focuses on how AI and radiomics could transform the diagnosis and treatment of thyroid cancer. 
Despite challenges, future research on multidisciplinary cooperation, clinical applicability validation, and algorithm improvement holds the potential to improve patient outcomes and diagnostic precision in the treatment of thyroid cancer.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07239</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07239</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07239.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07242.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07242&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07242&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Large Language Models (LLMs) are increasingly being developed and applied, but their widespread use faces challenges. These include aligning LLMs&#39; responses with human values to prevent harmful outputs, which is addressed through safety training methods. Even so, bad actors and malicious users have succeeded in attempts to manipulate the LLMs to generate misaligned responses for harmful questions such as methods to create a bomb in school labs, recipes for harmful drugs, and ways to evade privacy rights. Another challenge is the multilingual capabilities of LLMs, which enable the model to understand and respond in multiple languages. Consequently, attackers exploit the unbalanced pre-training datasets of LLMs in different languages and the comparatively lower model performance in low-resource languages than high-resource ones. As a result, attackers use low-resource languages to intentionally manipulate the model to create harmful responses. Many similar attack vectors have been patched by model providers, making the LLMs more robust against language-based manipulation. In this paper, we introduce a new black-box attack vector called the \emph{Sandwich attack}: a multi-language mixture attack, which manipulates state-of-the-art LLMs into generating harmful and misaligned responses. Our experiments with five different models, namely Google&#39;s Bard, Gemini Pro, LLaMA-2-70-B-Chat, GPT-3.5-Turbo, GPT-4, and Claude-3-OPUS, show that this attack vector can be used by adversaries to generate harmful responses and elicit misaligned responses from these models. 
By detailing both the mechanism and impact of the Sandwich attack, this paper aims to guide future research and development towards more secure and resilient LLMs, ensuring they serve the public good while minimizing potential for misuse.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07242</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07242</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07242.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Generative Resident Separation and Multi-label Classification for Multi-person Activity Recognition</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07245.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07245&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07245&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This paper presents two models to address the problem of multi-person activity recognition using ambient sensors in a home. The first model, Seq2Res, uses a sequence generation approach to separate sensor events from different residents. The second model, BiGRU+Q2L, uses a Query2Label multi-label classifier to predict multiple activities simultaneously. Performances of these models are compared to a state-of-the-art model in different experimental scenarios, using a state-of-the-art dataset of two residents in a home instrumented with ambient sensors. These results lead to a discussion on the advantages and drawbacks of resident separation and multi-label classification for multi-person activity recognition.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07245</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07245</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07245.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>AI-Guided Defect Detection Techniques to Model Single Crystal Diamond Growth</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07306.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07306&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07306&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;From a process development perspective, diamond growth via chemical vapor deposition has made significant strides. However, challenges persist in achieving high quality and large-area material production. These difficulties include controlling conditions to maintain uniform growth rates for the entire growth surface. As growth progresses, various factors or defect states emerge, altering the uniform conditions. These changes affect the growth rate and result in the formation of crystalline defects at the microscale. However, there is a distinct lack of methods to identify these defect states and their geometry using images taken during the growth process. This paper details seminal work on defect segmentation pipeline using in-situ optical images to identify features that indicate defective states that are visible at the macroscale. Using a semantic segmentation approach as applied in our previous work, these defect states and corresponding derivative features are isolated and classified by their pixel masks. Using an annotation focused human-in-the-loop software architecture to produce training datasets, with modules for selective data labeling using active learning, data augmentations, and model-assisted labeling, our approach achieves effective annotation accuracy and drastically reduces the time and cost of labeling by orders of magnitude. On the model development front, we found that deep learning-based algorithms are the most efficient. They can accurately learn complex representations from feature-rich datasets. Our best-performing model, based on the YOLOV3 and DeeplabV3plus architectures, achieved excellent accuracy for specific features of interest. 
Specifically, it reached 93.35% accuracy for center defects, 92.83% for polycrystalline defects, and 91.98% for edge defects.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07306</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07306</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07306.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Structured Reinforcement Learning for Media Streaming at the Wireless Edge</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07315.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07315&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07315&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Media streaming is the dominant application over wireless edge (access) networks. The increasing softwarization of such networks has led to efforts at intelligent control, wherein application-specific actions may be dynamically taken to enhance the user experience. The goal of this work is to develop and demonstrate learning-based policies for optimal decision making to determine which clients to dynamically prioritize in a video streaming setting. We formulate the policy design question as a constrained Markov decision problem (CMDP), and observe that by using a Lagrangian relaxation we can decompose it into single-client problems. Further, the optimal policy takes a threshold form in the video buffer length, which enables us to design an efficient constrained reinforcement learning (CRL) algorithm to learn it. Specifically, we show that a natural policy gradient (NPG) based algorithm that is derived using the structure of our problem converges to the globally optimal policy. We then develop a simulation environment for training, and a real-world intelligent controller attached to a WiFi access point for evaluation. We empirically show that the structured learning approach enables fast learning. Furthermore, such a structured policy can be easily deployed due to low computational complexity, leading to policy execution taking only about 15$\mu$s. Using YouTube streaming experiments in a resource constrained scenario, we demonstrate that the CRL approach can increase QoE by over 30%.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07315</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07315</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07315.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Interactive Learning of Physical Object Properties Through Robot Manipulation and Database of Object Measurements</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07344.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07344&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07344&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This work presents a framework for automatically extracting physical object properties, such as material composition, mass, volume, and stiffness, through robot manipulation and a database of object measurements. The framework involves exploratory action selection to maximize learning about objects on a table. A Bayesian network models conditional dependencies between object properties, incorporating prior probability distributions and uncertainty associated with measurement actions. The algorithm selects optimal exploratory actions based on expected information gain and updates object properties through Bayesian inference. Experimental evaluation demonstrates effective action selection compared to a baseline and correct termination of the experiments if there is nothing more to be learned. The algorithm proved to behave intelligently when presented with trick objects with material properties in conflict with their appearance. The robot pipeline integrates with a logging module and an online database of objects, containing over 24,000 measurements of 63 objects with different grippers. All code and data are publicly available, facilitating automatic digitization of objects and their physical properties through exploratory manipulations.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07344</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07344</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07344.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Addressing the Abstraction and Reasoning Corpus via Procedural Example Generation</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07353.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07353&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07353&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This work presents code to procedurally generate examples for the ARC training tasks. For each of the 400 tasks, an example generator following the transformation logic of the original examples was created. In effect, the assumed underlying distribution of examples for any given task was reverse engineered by implementing a means to sample from it. An attempt was made to cover as large a space of possible examples for each task as is reasonable. That is, whenever the original examples of a given task may be limited in their diversity, e.g. by having the dimensions of the grids, the set of symbols, or the number of objects constant or within tight bounds, even though the transformation does not require it, such constraints were lifted. Having access to not just a few examples per task, as is the case for ARC, but instead very many, should enable a wide range of experiments that may be important stepping stones towards making leaps on the benchmark.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07353</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07353</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07353.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>GANsemble for Small and Imbalanced Data Sets: A Baseline for Synthetic Microplastics Data</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07356.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07356&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07356&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Microplastic particle ingestion or inhalation by humans is a problem of growing concern. Unfortunately, current research methods that use machine learning to understand their potential harms are obstructed by a lack of available data. Deep learning techniques in particular are challenged by such domains where only small or imbalanced data sets are available. Overcoming this challenge often involves oversampling underrepresented classes or augmenting the existing data to improve model performance. This paper proposes GANsemble: a two-module framework connecting data augmentation with conditional generative adversarial networks (cGANs) to generate class-conditioned synthetic data. First, the data chooser module automates augmentation strategy selection by searching for the best data augmentation strategy. Next, the cGAN module uses this strategy to train a cGAN for generating enhanced synthetic data. We experiment with the GANsemble framework on a small and imbalanced microplastics data set. A Microplastic-cGAN (MPcGAN) algorithm is introduced, and baselines for synthetic microplastics (SYMP) data are established in terms of Frechet Inception Distance (FID) and Inception Scores (IS). We also provide a synthetic microplastics filter (SYMP-Filter) algorithm to increase the quality of generated SYMP. Additionally, we show the best amount of oversampling with augmentation to fix class imbalance in small microplastics data sets. To our knowledge, this study is the first application of generative AI to synthetically create microplastics data.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07356</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07356</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07356.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Differentially Private GANs for Generating Synthetic Indoor Location Data</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07366.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07366&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07366&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The advent of location-based services has led to the widespread adoption of indoor localization systems, which enable location tracking of individuals within enclosed spaces such as buildings. While these systems provide numerous benefits such as improved security and personalized services, they also raise concerns regarding privacy violations. As such, there is a growing need for privacy-preserving solutions that can protect users&#39; sensitive location information while still enabling the functionality of indoor localization systems. In recent years, Differentially Private Generative Adversarial Networks (DPGANs) have emerged as a powerful methodology that aims to protect the privacy of individual data points while generating realistic synthetic data similar to original data. DPGANs combine the power of generative adversarial networks (GANs) with the privacy-preserving technique of differential privacy (DP). In this paper, we introduce an indoor localization framework employing DPGANs in order to generate privacy-preserving indoor location data. We evaluate the performance of our framework on a real-world indoor localization dataset and demonstrate its effectiveness in preserving privacy while maintaining the accuracy of the localization system.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07366</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07366</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07366.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Deep Generative Sampling in the Dual Divergence Space: A Data-efficient &amp; Interpretative Approach for Generative AI</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07377.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07377&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07377&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Building on the remarkable achievements in generative sampling of natural images, we propose an innovative challenge, potentially overly ambitious, which involves generating samples of entire multivariate time series that resemble images. However, the statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects. This issue is especially problematic for deep generative models that follow the conventional approach of generating samples from a canonical distribution and then decoding or denoising them to match the true data distribution. In contrast, our method is grounded in information theory and aims to implicitly characterize the distribution of images, particularly the (global and local) dependency structure between pixels. We achieve this by empirically estimating its KL-divergence in the dual form with respect to the respective marginal distribution. This enables us to perform generative sampling directly in the optimized 1-D dual divergence space. Specifically, in the dual space, training samples representing the data distribution are embedded in the form of various clusters between two end points. In theory, any sample embedded between those two end points is in-distribution w.r.t. the data distribution. Our key idea for generating novel samples of images is to interpolate between the clusters via a walk as per gradients of the dual function w.r.t. the data dimensions. In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution. 
We provide strong theoretical guarantees along with an extensive empirical evaluation using many real-world datasets from diverse domains, establishing the superiority of our approach w.r.t. state-of-the-art deep learning methods.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07377</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07377</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07377.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07383.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07383&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07383&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Autonomous vehicles often make complex decisions via machine learning-based predictive models applied to collected sensor data. While this combination of methods provides a foundation for real-time actions, self-driving behavior primarily remains opaque to end users. In this sense, explainability of real-time decisions is a crucial and natural requirement for building trust in autonomous vehicles. Moreover, as autonomous vehicles still cause serious traffic accidents for various reasons, timely conveyance of upcoming hazards to road users can help improve scene understanding and prevent potential risks. Hence, there is also a need to supply autonomous vehicles with user-friendly interfaces for effective human-machine teaming. Motivated by this problem, we study the role of explainable AI and human-machine interface jointly in building trust in vehicle autonomy. We first present a broad context of the explanatory human-machine systems with the &quot;3W1H&quot; (what, whom, when, how) approach. Based on these findings, we present a situation awareness framework for calibrating users&#39; trust in self-driving behavior. Finally, we perform an experiment on our framework, conduct a user study on it, and validate the empirical findings with hypothesis testing.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07383</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07383</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07383.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>BISCUIT: Scaffolding LLM-Generated Code with Ephemeral UIs in Computational Notebooks</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07387.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07387&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07387&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Novices frequently engage with machine learning tutorials in computational notebooks and have been adopting code generation technologies based on large language models (LLMs). However, they encounter difficulties in understanding and working with code produced by LLMs. To mitigate these challenges, we introduce a novel workflow into computational notebooks that augments LLM-based code generation with an additional ephemeral UI step, offering users UI-based scaffolds as an intermediate stage between user prompts and code generation. We present this workflow in BISCUIT, an extension for JupyterLab that provides users with ephemeral UIs generated by LLMs based on the context of their code and intentions, scaffolding users to understand, guide, and explore with LLM-generated code. Through 10 user studies where novices used BISCUIT for machine learning tutorials, we discover that BISCUIT offers user semantic representation of code to aid their understanding, reduces the complexity of prompt engineering, and creates a playground for users to explore different variables and iterate on their ideas. We discuss the implications of our findings for UI-centric interactive paradigm in code generation LLMs.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07387</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07387</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07387.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>ChatGPT Can Predict the Future when it Tells Stories Set in the Future About the Past</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07396.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07396&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07396&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;This study investigates whether OpenAI&#39;s ChatGPT-3.5 and ChatGPT-4 can accurately forecast future events using two distinct prompting strategies. To evaluate the accuracy of the predictions, we take advantage of the fact that the training data at the time of experiment stopped at September 2021, and ask about events that happened in 2022 using ChatGPT-3.5 and ChatGPT-4. We employed two prompting strategies: direct prediction and what we call future narratives which ask ChatGPT to tell fictional stories set in the future with characters that share events that have happened to them, but after ChatGPT&#39;s training data had been collected. Concentrating on events in 2022, we prompted ChatGPT to engage in storytelling, particularly within economic contexts. After analyzing 100 prompts, we discovered that future narrative prompts significantly enhanced ChatGPT-4&#39;s forecasting accuracy. This was especially evident in its predictions of major Academy Award winners as well as economic trends, the latter inferred from scenarios where the model impersonated public figures like the Federal Reserve Chair, Jerome Powell. These findings indicate that narrative prompts leverage the models&#39; capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs&#39; predictive capabilities and suggests potential future applications in analytical contexts.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07396</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07396</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07396.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>JetMoE: Reaching Llama2 Performance with 0.1M Dollars</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07413.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07413&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07413&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Large Language Models (LLMs) have achieved remarkable results, but their increasing resource demand has become a major obstacle to the development of powerful and accessible super-human intelligence. This report introduces JetMoE-8B, a new LLM trained with less than $0.1 million, using 1.25T tokens from carefully mixed open-source corpora and 30,000 H100 GPU hours. Despite its low cost, the JetMoE-8B demonstrates impressive performance, with JetMoE-8B outperforming the Llama2-7B model and JetMoE-8B-Chat surpassing the Llama2-13B-Chat model. These results suggest that LLM training can be much more cost-effective than generally thought. JetMoE-8B is based on an efficient Sparsely-gated Mixture-of-Experts (SMoE) architecture, composed of attention and feedforward experts. Both layers are sparsely activated, allowing JetMoE-8B to have 8B parameters while only activating 2B for each input token, reducing inference computation by about 70% compared to Llama2-7B. Moreover, JetMoE-8B is highly open and academia-friendly, using only public datasets and training code. All training parameters and data mixtures have been detailed in this report to facilitate future efforts in the development of open foundation models. This transparency aims to encourage collaboration and further advancements in the field of accessible and efficient LLMs. The model weights are publicly available at https://github.com/myshell-ai/JetMoE.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07413</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07413</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07413.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Data-Driven Portfolio Management for Motion Pictures Industry: A New Data-Driven Optimization Methodology Using a Large Language Model as the Expert</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07434.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07434&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07434&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Portfolio management is one of the unresolved problems of the Motion Pictures Industry (MPI). To design an optimal portfolio for an MPI distributor, it is essential to predict the box office of each project. Moreover, for an accurate box office prediction, it is critical to consider the effect of the celebrities involved in each MPI project, which was impossible with any prior expert-based method. Additionally, the asymmetric characteristic of MPI data decreases the performance of any predictive algorithm. In this paper, firstly, the fame score of the celebrities is determined using a large language model. Then, to tackle the asymmetric character of MPI&#39;s data, projects are classified. Furthermore, the box office prediction takes place for each class of projects. Finally, using a hybrid multi-attribute decision-making technique, the preferability of each project for the distributor is calculated, and benefiting from a bi-objective optimization model, the optimal portfolio is designed.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07434</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07434</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07434.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>Graph Attention Network for Lane-Wise and Topology-Invariant Intersection Traffic Simulation</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07446.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07446&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07446&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;Traffic congestion has significant economic, environmental, and social ramifications. Intersection traffic flow dynamics are influenced by numerous factors. While microscopic traffic simulators are valuable tools, they are computationally intensive and challenging to calibrate. Moreover, existing machine-learning approaches struggle to provide lane-specific waveforms or adapt to intersection topology and traffic patterns. In this study, we propose two efficient and accurate &quot;Digital Twin&quot; models for intersections, leveraging Graph Attention Neural Networks (GAT). These attentional graph auto-encoder digital twins capture temporal, spatial, and contextual aspects of traffic within intersections, incorporating various influential factors such as high-resolution loop detector waveforms, signal state records, driving behaviors, and turning-movement counts. Trained on diverse counterfactual scenarios across multiple intersections, our models generalize well, enabling the estimation of detailed traffic waveforms for any intersection approach and exit lanes. Multi-scale error metrics demonstrate that our models perform comparably to microsimulations. The primary application of our study lies in traffic signal optimization, a pivotal area in transportation systems research. These lightweight digital twins can seamlessly integrate into corridor and network signal timing optimization frameworks. Furthermore, our study&#39;s applications extend to lane reconfiguration, driving behavior analysis, and facilitating informed decisions regarding intersection safety and efficiency enhancements. 
A promising avenue for future research involves extending this approach to urban freeway corridors and integrating it with measures of effectiveness metrics.&lt;/p&gt; </description>
      <link>https://papers.cool/arxiv/2404.07446</link>
      <guid isPermaLink="false">https://papers.cool/arxiv/2404.07446</guid>
      <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
      <enclosure url="https://arxiv.org/pdf/2404.07446.pdf" type="application/pdf"></enclosure>
    </item>
    <item>
      <title>RiskLabs: Predicting Financial Risk Using Large Language Model Based on Multi-Sources Data</title>
      <description>&lt;a href=&quot;https://arxiv.org/pdf/2404.07452.pdf&quot;&gt;[PDF]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/2404.07452&quot;&gt;[Site]&lt;/a&gt; &lt;a href=&quot;https://papers.cool/arxiv/kimi/2404.07452&quot;&gt;[Kimi]&lt;/a&gt; &lt;p&gt;The integration of Artificial Intelligence (AI) techniques, particularly large language models (LLMs), in finance has garnered increasing academic attention. Despite progress, existing studies predominantly focus on tasks like financial text summarization, question-answering (Q$\&amp;amp;$A), and stock movement prediction (binary classification), with a notable gap in the application of LLMs for financial risk prediction. Addressing this gap, in this paper, we introduce \textbf{RiskLabs}, a novel framework that leverages LLMs to analyze and predict financial risks. RiskLabs uniquely combines different types of financial data, including textual and vocal information from Earnings Conference Calls (ECCs), market-related time series data, and contextual news data surrounding ECC release dates. Our approach involves a multi-stage p

@TonyRL merged commit 517fa8d into DIYgod:master on Apr 14, 2024
27 checks passed
@nczitzk deleted the fix/papers branch on April 14, 2024 16:21
Successfully merging this pull request may close these issues: papers.cool is not working