EFindAI/Awesome-FinFMs

Advancing Financial Engineering with Financial Foundation Models: Progress, Applications, and Challenges

A Survey on Financial Foundation Models

FinLLM Framework

📝 Introduction

This repository is the official companion to the survey paper "Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges", published in the journal Engineering. The paper systematically reviews the progress, applications, and challenges of Financial Foundation Models (FFMs), covering three key categories: Financial Language Foundation Models (FinLFMs), Financial Time-Series Foundation Models (FinTSFMs), and Financial Visual-Language Foundation Models. It also compiles relevant datasets and real-world financial applications enabled by FFMs.

This repo serves as a centralized resource for researchers and practitioners in the field of financial artificial intelligence, providing curated references to seminal papers, open-source code, and benchmark datasets related to financial foundation models.

📖 Table of contents

Awesome Papers

Financial Foundation Models

Financial language foundation models

BERT-style FinLFMs

[1] FinBERT: Financial Sentiment Analysis with Pretrained Language Models

[2] Finbert: A pretrained language model for financial communications.

[3] Finbert: A pre-trained financial language representation model for financial text mining

[4] Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese

[5] WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain

[6] BBT-Fin: Comprehensive Construction of Chinese Financial Domain Pre-trained Language Model, Corpus and Benchmark

GPT-style FinLFMs

[1] PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance code

[2] BloombergGPT: A Large Language Model for Finance FinLLM workshop @ IJCAI2023

[3] InvestLM: A Large Language Model for Investment using Financial Domain Instruction Tuning code

[4] Xuanyuan 2.0: A large chinese financial chat model with hundreds of billions parameters code

[5] PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation

[6] Finqwen: Ai+finance. code

[7] Fingpt: Open-source financial large language models code

[8] Ploutos: Towards interpretable stock movement prediction with financial large language model

[9] Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications code

[10] Baichuan4-Finance Technical Report

[11] Disc-finllm: A chinese financial large language model based on multiple experts fine-tuning code

[12] Cfgpt: Chinese financial assistant with large language model. code

[13] No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks code

Reasoning-enhanced FinLFMs

[1] Xuanyuan-finx1. code

[2] Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance code

Financial time-series foundation models

Naive FinTSFMs trained from scratch

[1] Marketgpt: Developing a pretrained transformer (gpt) for modeling financial time series code

[2] A decoder-only foundation model for time-series forecasting code

[3] Financial fine-tuning a large time series model code

[4] Dual adaptation of time-series foundation models for financial forecasting

FinTSFMs adapted from language models

[1] Time-llm: Time series forecasting by reprogramming large language models code

[2] Unitime: A language-empowered unified model for crossdomain time series forecasting code

[3] Sociodojo: Building lifelong analytical agents with real-world text and time series code

Financial visual-language foundation models

[1] Finvis-gpt: A multimodal large language model for financial chart analysis code

[2] Fintral: A family of gpt-4 level multimodal financial large language models code

[3] Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications code

Financial Data

Financial text-based datasets

Task-specific and English-centric datasets

[1] Good debt or bad debt: Detecting semantic orientations in economic texts data

[2] Domain adaption of named entity recognition to support credit risk assessment data

[3] Www’18 open challenge: Financial opinion mining and question answering data

[4] Stock movement prediction from tweets and historical price data

[5] Hybrid deep sequential modeling for social text-driven stock prediction data

[6] Impact of news on the commodity market: Dataset and results data

[7] Fintral: A family of gpt-4 level multimodal financial large language models data

Multi-task integration and language expansion datasets

[1] WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain data

[2] PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance data

[3] FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models data

[4] CFBenchmark: Chinese financial assistant benchmark for large language model data

[5] Financeiq: Chinese financial domain knowledge assessment dataset data

[6] CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models data

[7] FLAME: Financial Large-Language Model Assessment and Metrics Evaluation data

Cross-lingual and real-world benchmarks

[1] No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks data

[2] A dutch financial large language model data

[3] Benchmarking large language models on cflue - a chinese financial language understanding evaluation dataset data

[4] M³finmeeting: Multi-lingual multimodal benchmark for financial meeting understanding data

[5] FinBen: a holistic financial benchmark for large language models data

[6] AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework data

Financial time-series-related datasets

[1] Google stock prices – training and test data (2012–2017)

[2] S&P 500 historical data (1927–2020)

[3] Modeling long and short-term temporal patterns with deep neural networks data

[4] Bitcoin daily price time series (2010–2020)

[5] Fnspid: A comprehensive financial news dataset in time series data

[6] Fintsb: A comprehensive and practical benchmark for financial time series forecasting data

Financial visual-language-related datasets

[1] Statlog (Australian Credit Approval) data

[2] Statlog (German Credit Data) data

[3] Tat-qa: A question answering benchmark on a hybrid of tabular and textual content in finance data

[4] Finqa: A dataset of numerical reasoning over financial data data

[5] Chartqa: A benchmark for question answering about charts with visual and logical reasoning data

[6] Convfinqa: Exploring the chain of numerical reasoning in conversational finance question answering data

[7] Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI data

[8] Fintral: A family of gpt-4 level multimodal financial large language models data

[9] Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications data

[10] Mme-finance: A multimodal finance benchmark for expert-level understanding and reasoning data

[11] Fcmr: Robust evaluation of financial cross-modal multi-hop reasoning

[12] Famma: A benchmark for financial domain multilingual multimodal question answering data

[13] Finmme: Benchmark dataset for financial multi-modal reasoning evaluation data

FFM-based financial applications

Financial knowledge extraction

[1] No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks

[2] Large language models as financial data annotators: A study on effectiveness and efficiency

[3] Assessing large language models used for extracting table information from annual financial reports

Market prediction

[1] Time-series foundation model for value-at-risk

[2] Can large language models beat wall street? evaluating gpt-4’s impact on financial decision-making with marketsenseai

[3] Enhancing stock timing predictions based on multimodal architecture: Leveraging large language models (llms) for text quality improvement

[4] Can chatgpt improve investment decisions? from a portfolio management perspective

Trading and financial decision-making

[1] Ra-cfgpt: Chinese financial assistant with retrieval-augmented large language model

[2] Finmem: A performance-enhanced llm trading agent with layered memory and character design

[3] Llmfactor: Extracting profitable factors through prompts for explainable stock movement prediction

[4] Can chatgpt improve investment decisions? from a portfolio management perspective

Agent-based financial simulation

[1] Tradingagents: Multi-agents llm financial trading framework

[2] Can large language models trade? testing financial theories with llm agents in market simulations

[3] When ai meets finance (stockagent): Large language model-based stock trading in simulated real-world environments

[4] FinRpt: Dataset, Evaluation System and LLM-based Multi-agent Framework for Equity Research Report Generation code data

Citation

If you find this survey or the curated resources helpful for your research, please cite the paper as follows:

@article{chen2025advancing,
    title = {Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges},
    author = {Chen, Liyuan and Liu, Shuoling and Yan, Jiangpeng and Wang, Xiaoyu and Liu, Henglin and Li, Chuang and Jiao, Kecheng and Ying, Jixuan and Liu, Yang Veronica and Yang, Qiang and Li, Xiu},
    journal = {Engineering},
    year = {2025},
    doi = {10.1016/j.eng.2025.11.029},
    url = {https://doi.org/10.1016/j.eng.2025.11.029}
}

Contribution

We welcome contributions from the community. To recommend a new paper (including your own work) for the list, please open an issue in this repository with the following information:

  • Paper title and arXiv/DOI URL
  • Code repository link (if available)
  • Brief classification (e.g., GPT-style FinLFM, Financial time-series dataset, Market Prediction application)

We will review and update the list regularly to keep it comprehensive and up-to-date.
