Advancing Financial Engineering with Financial Foundation Models: Progress, Applications, and Challenges
This repository is the official companion to the survey paper "Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges", published in the journal Engineering. The paper systematically reviews the progress, applications, and challenges of Financial Foundation Models (FFMs), covering three key categories: Financial Language Foundation Models (FinLFMs), Financial Time-Series Foundation Models (FinTSFMs), and Financial Visual-Language Foundation Models. It also collects relevant datasets and real-world financial applications enabled by FFMs.
This repo serves as a centralized resource for researchers and practitioners in the field of financial artificial intelligence, providing curated references to seminal papers, open-source code, and benchmark datasets related to financial foundation models.
- Awesome Papers
- Citation
- Contribution
[1] FinBERT: Financial Sentiment Analysis with Pretrained Language Models
[2] FinBERT: A pretrained language model for financial communications
[3] FinBERT: A pre-trained financial language representation model for financial text mining
[4] Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese
[5] WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain
[1] PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance code
[2] BloombergGPT: A Large Language Model for Finance FinLLM workshop @ IJCAI2023
[3] InvestLM: A Large Language Model for Investment using Financial Domain Instruction Tuning code
[4] Xuanyuan 2.0: A large Chinese financial chat model with hundreds of billions parameters code
[5] PanGu-𝜋: Enhancing Language Model Architectures via Nonlinearity Compensation
[7] Fingpt: Open-source financial large language models code
[8] Ploutos: Towards interpretable stock movement prediction with financial large language model
[9] Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications code
[10] Baichuan4-Finance Technical Report
[11] Disc-finllm: A Chinese financial large language model based on multiple experts fine-tuning code
[12] Cfgpt: Chinese financial assistant with large language model code
[13] No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks code
[1] Xuanyuan-finx1 code
[2] Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance code
[1] Marketgpt: Developing a pretrained transformer (gpt) for modeling financial time series code
[2] A decoder-only foundation model for time-series forecasting code
[3] Financial fine-tuning a large time series model code
[4] Dual adaptation of time-series foundation models for financial forecasting
[1] Time-llm: Time series forecasting by reprogramming large language models code
[2] Unitime: A language-empowered unified model for cross-domain time series forecasting code
[3] Sociodojo: Building lifelong analytical agents with real-world text and time series code
[1] Finvis-gpt: A multimodal large language model for financial chart analysis code
[2] Fintral: A family of gpt-4 level multimodal financial large language models code
[3] Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications code
[1] Good debt or bad debt: Detecting semantic orientations in economic texts data
[2] Domain adaption of named entity recognition to support credit risk assessment data
[3] WWW’18 open challenge: Financial opinion mining and question answering data
[4] Stock movement prediction from tweets and historical price data
[5] Hybrid deep sequential modeling for social text-driven stock prediction data
[6] Impact of news on the commodity market: Dataset and results data
[7] Fintral: A family of gpt-4 level multimodal financial large language models data
[1] WHEN FLUE MEETS FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain data
[2] PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance data
[3] FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models data
[4] CFBenchmark: Chinese financial assistant benchmark for large language model data
[5] Financeiq: Chinese financial domain knowledge assessment dataset data
[6] CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models data
[7] FLAME: Financial Large-Language Model Assessment and Metrics Evaluation data
[1] No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks data
[2] A Dutch financial large language model data
[3] Benchmarking large language models on CFLUE - a Chinese financial language understanding evaluation dataset data
[4] M³finmeeting: Multi-lingual multimodal benchmark for financial meeting understanding data
[5] FinBen: a holistic financial benchmark for large language models data
[6] AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework data
[1] Google stock prices – training and test data (2012–2017)
[2] S&P 500 historical data (1927–2020)
[3] Modeling long and short-term temporal patterns with deep neural networks data
[4] Bitcoin daily price time series (2010–2020)
[5] Fnspid: A comprehensive financial news dataset in time series data
[6] Fintsb: A comprehensive and practical benchmark for financial time series forecasting data
[1] Statlog (Australian Credit Approval) data
[2] Statlog (German Credit Data) data
[3] Tat-qa: A question answering benchmark on a hybrid of tabular and textual content in finance data
[4] Finqa: A dataset of numerical reasoning over financial data data
[5] Chartqa: A benchmark for question answering about charts with visual and logical reasoning data
[6] Convfinqa: Exploring the chain of numerical reasoning in conversational finance question answering data
[7] Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI data
[8] Fintral: A family of gpt-4 level multimodal financial large language models data
[9] Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications data
[10] Mme-finance: A multimodal finance benchmark for expert-level understanding and reasoning data
[11] Fcmr: Robust evaluation of financial cross-modal multi-hop reasoning
[12] Famma: A benchmark for financial domain multilingual multimodal question answering data
[13] Finmme: Benchmark dataset for financial multi-modal reasoning evaluation data
[2] Large language models as financial data annotators: A study on effectiveness and efficiency
[3] Assessing large language models used for extracting table information from annual financial reports
[1] Time-series foundation model for value-at-risk
[4] Can chatgpt improve investment decisions? from a portfolio management perspective
[1] Ra-cfgpt: Chinese financial assistant with retrieval-augmented large language model
[2] Finmem: A performance-enhanced llm trading agent with layered memory and character design
[3] Llmfactor: Extracting profitable factors through prompts for explainable stock movement prediction
[4] Can chatgpt improve investment decisions? from a portfolio management perspective
[1] Tradingagents: Multi-agents llm financial trading framework
[2] Can large language models trade? testing financial theories with llm agents in market simulations
[4] FinRpt: Dataset, Evaluation System and LLM-based Multi-agent Framework for Equity Research Report Generation code data
If you find this survey or the curated resources helpful for your research, please cite the paper as follows:
@article{chen2025advancing,
  title = {Advancing Financial Engineering with Foundation Models: Progress, Applications, and Challenges},
  author = {Chen, Liyuan and Liu, Shuoling and Yan, Jiangpeng and Wang, Xiaoyu and Liu, Henglin and Li, Chuang and Jiao, Kecheng and Ying, Jixuan and Liu, Yang Veronica and Yang, Qiang and Li, Xiu},
  journal = {Engineering},
  year = {2025},
  doi = {10.1016/j.eng.2025.11.029},
  url = {https://doi.org/10.1016/j.eng.2025.11.029}
}
We welcome community contributions to this repository! To recommend new papers (including your own work) for the list, please open an issue in this repository with the following information:
- Paper title and arXiv/DOI URL
- Code repository link (if available)
- Brief classification (e.g., GPT-style FinLFM, Financial time-series dataset, Market Prediction application)
We will review and update the list regularly to keep it comprehensive and up-to-date.
