In [11]:
import glob
import os
import re

from IPython.display import display, Markdown

# Function to find all PNG files in the current directory
def find_png_files():
    # Find all PNG files
    png_files = glob.glob("*.png")
    return sorted(png_files)

# Generate a caption based on the filename
def generate_caption(filename):
    # Remove file extension and replace underscores with spaces
    base_name = os.path.splitext(filename)[0]
    caption = base_name.replace('_', ' ')
    
    # Capitalize the first character; guard against an empty name such as "_.png"
    caption = caption.strip()
    if caption and not caption[0].isupper():
        caption = caption[0].upper() + caption[1:]
    
    # If caption doesn't end with period, add one
    if caption and not caption.endswith('.'):
        caption += '.'
    
    return caption

# Define the structure of the LaTeX document
def create_latex_document():
    # Get list of PNG files
    png_files = find_png_files()
    
    if not png_files:
        print("Warning: No PNG files found in the current directory.")
    
    # LaTeX preamble with necessary packages
    latex_code = r'''
\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[margin=1in]{geometry}
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{float}
\usepackage{hyperref}
\usepackage{amsmath}
\usepackage{natbib}
\usepackage{setspace}
\usepackage{xcolor}

\title{Bank Asset Composition and Treasury Securities Analysis}
\author{Financial Research Team}
\date{\today}

\begin{document}

\maketitle

\begin{abstract}
This report analyzes bank asset composition, liability structure, and Treasury securities performance from 2019 to 2024. Using data from various financial institutions, we examine the relationships among bank size, portfolio composition, and market trends. The analysis includes visualizations of asset allocations, liability distributions, Treasury bond ETF correlations, and mortgage-backed securities performance. Our findings provide insights into banking sector stability, investment patterns, and potential risk factors based on Q1 2022 data and subsequent market movements.
\end{abstract}

\section{Introduction}
Analysis of bank asset composition and liability structure provides critical insights into the stability and market exposure of financial institutions. This report examines detailed breakdowns of asset allocations across bank size categories, with particular focus on Treasury securities, real estate loans, and mortgage-backed securities (MBS). Understanding these relationships is essential for assessing systemic risk within the banking sector and identifying potential vulnerabilities during periods of market volatility.

\section{Methodology and Data Sources}
'''

    # First part of the analysis paragraph
    latex_code += r'''
\subsection{Overcoming Data Challenges and Leveraging Automated Solutions in Financial Analysis}

Throughout this project, our primary challenges were data acquisition and processing. Many of our data sources were incomplete or inconsistent, so we had to develop workarounds to ensure accuracy and reliability. Key columns were often missing, some datasets were sparsely populated, and minor discrepancies in values had a significant impact on our calculations. What initially appeared to be a straightforward analysis quickly became an intricate process of troubleshooting and refinement.

\section{Data Acquisition Challenges}
'''

    # Second part of the analysis paragraph
    latex_code += r'''
One of our most pressing issues was obtaining reliable Treasury Index data. To address this, we used Selenium to automate web scraping directly from the S\&P Global website, which gave us access to the most up-to-date information. Automating this step let us capture the data efficiently while reducing manual errors and improving data integrity.

Similarly, we extracted mortgage-backed securities (MBS) ETF data using yfinance, which retrieves historical and real-time market data directly from Yahoo Finance. This streamlined our analysis of trends and correlations in MBS performance. Integrating yfinance into our pipeline automated data retrieval and kept our dataset current and comprehensive.

\section{Structured Data Processing}
'''

    # Third part of the analysis paragraph
    latex_code += r'''
A critical component of our analysis involved accessing detailed bank call report information through Wharton Research Data Services (WRDS). WRDS provided a structured and reliable way to extract specific financial variables from regulatory filings, particularly the RCON and RCFD series, which contain essential banking metrics. Through WRDS's SQL interface, we obtained standardized data that allowed for rigorous time series analysis. This approach enabled us to systematically examine regulatory financial data across multiple institutions, capturing significant trends and patterns in bank performance over time.

One of the advantages of using WRDS was its ability to structure and store data efficiently. The system's capability to pull live data and store it in an optimized Parquet format facilitated our subsequent in-depth analysis of financial indicators and their relationships. However, we encountered an issue with the RCON Series 1 data, where key columns were missing from the WRDS-provided dataset. To overcome this, we manually parsed a Parquet file to extract the necessary information, ensuring that our analysis remained comprehensive and accurate. Despite this challenge, WRDS proved invaluable in obtaining the majority of our required data seamlessly.

\section{Analytical Challenges}
'''

    # Fourth part of the analysis paragraph
    latex_code += r'''
Beyond data retrieval, we faced difficulties in performing precise calculations due to slight variations in data points, which could easily skew results. This required us to spend significant time dissecting the numbers, validating our methodology, and refining our calculations. While financial papers often presented these computations as straightforward, the reality involved meticulous verification and cross-referencing.

Ultimately, the combination of web scraping, API-driven data retrieval, and structured SQL queries provided us with a robust framework for tackling these challenges. By integrating these tools, we ensured both the accuracy and efficiency of our data pipeline, allowing for a deeper and more reliable analysis. The experience highlighted the importance of adaptability in working with financial data and reinforced our ability to develop automated solutions for complex data environments.

\newpage

\section{Figures and Tables}
'''

    # Custom order for figures: move figures 9 and 10 (indices 8 and 9)
    # so that they appear directly after figure 11 (index 10)
    if len(png_files) >= 11:
        reordered_files = (
            png_files[:8]
            + [png_files[10], png_files[8], png_files[9]]
            + png_files[11:]
        )
    else:
        # Fewer than 11 files: nothing to reorder, so keep every file in sorted order
        reordered_files = list(png_files)
    
    
    # Add all figures in the custom order
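    for png_file in reordered_files:
        # Escape "&" so captions such as "S&P ..." do not break \caption
        caption = generate_caption(png_file).replace('&', r'\&')
        # Build the label from letters, digits, and hyphens only
        stem = os.path.splitext(png_file)[0]
        label = "fig:" + re.sub(r'[^A-Za-z0-9]+', '-', stem).strip('-')
        
        # \detokenize keeps "&" and "_" in the filename from being read as LaTeX syntax
        latex_code += f'''
\\begin{{figure}}[H]
\\centering
\\includegraphics[width=0.8\\textwidth]{{\\detokenize{{{png_file}}}}}
\\caption{{{caption}}}
\\label{{{label}}}
\\end{{figure}}

'''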

    # Close the document
    latex_code += r'''
\end{document}
'''

    return latex_code

# Function to save the LaTeX document
def save_latex_document(latex_code, filename='bank_analysis.tex'):
    with open(filename, 'w', encoding='utf-8') as f:
        f.write(latex_code)
    print(f"LaTeX document saved as {filename}")
    return filename

# Main execution
def main():
    # Get list of PNG files
    png_files = find_png_files()
    
    if not png_files:
        print("No PNG files found in the current directory.")
        return
    
    print(f"Found {len(png_files)} PNG files:")
    for png in png_files:
        print(f"  - {png}")
    
    # Generate and save the LaTeX document
    latex_content = create_latex_document()
    tex_filename = save_latex_document(latex_content)
    
    # Display a preview of the document structure
    display(Markdown("## LaTeX Document Structure Preview"))
    display(Markdown("The LaTeX document has been generated with the following sections:"))
    display(Markdown("1. Introduction"))
    display(Markdown("2. Methodology and Data Sources"))
    display(Markdown("3. Data Acquisition Challenges"))
    display(Markdown("4. Structured Data Processing"))
    display(Markdown("5. Analytical Challenges"))
    display(Markdown("6. Figures and Tables"))
    
    # Display information about the PNG files used in reordered sequence
    indices = list(range(len(png_files)))
    if len(png_files) >= 11:
        # Move figures 9 and 10 (indices 8 and 9) to directly after figure 11 (index 10)
        reordered_indices = indices[:8] + [10, 8, 9] + indices[11:]
    else:
        # Not enough files for reordering
        reordered_indices = indices
    
    display(Markdown("## PNG Files Included in the Document (in display order)"))
    for i, idx in enumerate(reordered_indices):
        if idx < len(png_files):
            png = png_files[idx]
            caption = generate_caption(png)
            # Display the original figure number and the new order
            original_num = idx + 1
            display(Markdown(f"{i+1}. **{png}** (Original Fig. {original_num}): {caption}"))
    
    # Display compilation instructions
    display(Markdown("## Compilation Instructions"))
    display(Markdown("To compile the LaTeX document:"))
    display(Markdown("```bash\npdflatex bank_analysis.tex\n```"))
    display(Markdown("For better bibliography support, run:"))
    display(Markdown("```bash\npdflatex bank_analysis.tex\nbibtex bank_analysis\npdflatex bank_analysis.tex\npdflatex bank_analysis.tex\n```"))

# Execute the main function
if __name__ == "__main__":
    main()

Found 14 PNG files:
  - Aggregate_Bank_Asset_Composition.png
  - Aggregate_Bank_Liability_Composition.png
  - MBS_Price_Data_Over_Time.png
  - MBS_Trading_Volume_Over_Time.png
  - S&P_Treasury_Bond_Series_Over_Time.png
  - S&P_U.S._Treasury_Bond_Index_Over_Time.png
  - Securities_Composition_by_Bank_Size.png
  - Treasury_Correlation_Heatmap.png
  - figure_a1.1.png
  - figure_a1.2.png
  - iShares_Treasury_Bond_ETFs_Over_Time.png
  - table_1.png
  - table_a1.1.png
  - table_a1.2.png
LaTeX document saved as bank_analysis.tex
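
Two of the filenames listed above contain `&`, which LaTeX treats as an alignment character, so a caption derived from them needs escaping before it is placed inside `\caption{...}`. The following is a minimal sketch of one way to do that, not the notebook's own code: `escape_latex` and `caption_from_filename` are hypothetical helper names, and the set of escaped characters is an assumption about what a filename might contain.

```python
import os

# Characters with special meaning in LaTeX text, mapped to escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters one character at a time."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def caption_from_filename(filename):
    """Hypothetical variant of generate_caption that returns LaTeX-safe text."""
    stem = os.path.splitext(filename)[0]
    caption = stem.replace("_", " ").strip()
    # Capitalize the first character without failing on an empty string
    caption = caption[:1].upper() + caption[1:]
    if caption and not caption.endswith("."):
        caption += "."
    return escape_latex(caption)


print(caption_from_filename("S&P_Treasury_Bond_Series_Over_Time.png"))
# S\&P Treasury Bond Series Over Time.
```

Escaping character by character avoids the double-escaping that chained `str.replace` calls can cause once a backslash has been introduced.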


## LaTeX Document Structure Preview

The LaTeX document has been generated with the following sections:

1. Introduction

2. Methodology and Data Sources

3. Data Acquisition Challenges

4. Structured Data Processing

5. Analytical Challenges

6. Figures and Tables

## PNG Files Included in the Document (in display order)

1. **Aggregate_Bank_Asset_Composition.png** (Original Fig. 1): Aggregate Bank Asset Composition.

2. **Aggregate_Bank_Liability_Composition.png** (Original Fig. 2): Aggregate Bank Liability Composition.

3. **MBS_Price_Data_Over_Time.png** (Original Fig. 3): MBS Price Data Over Time.

4. **MBS_Trading_Volume_Over_Time.png** (Original Fig. 4): MBS Trading Volume Over Time.

5. **S&P_Treasury_Bond_Series_Over_Time.png** (Original Fig. 5): S&P Treasury Bond Series Over Time.

6. **S&P_U.S._Treasury_Bond_Index_Over_Time.png** (Original Fig. 6): S&P U.S. Treasury Bond Index Over Time.

7. **Securities_Composition_by_Bank_Size.png** (Original Fig. 7): Securities Composition by Bank Size.

8. **Treasury_Correlation_Heatmap.png** (Original Fig. 8): Treasury Correlation Heatmap.

9. **iShares_Treasury_Bond_ETFs_Over_Time.png** (Original Fig. 11): IShares Treasury Bond ETFs Over Time.

10. **figure_a1.1.png** (Original Fig. 9): Figure a1.1.

11. **figure_a1.2.png** (Original Fig. 10): Figure a1.2.

12. **table_1.png** (Original Fig. 12): Table 1.

13. **table_a1.1.png** (Original Fig. 13): Table a1.1.

14. **table_a1.2.png** (Original Fig. 14): Table a1.2.
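
The display order above places original figures 9 and 10 directly after original figure 11. That reordering can be sketched as a small standalone helper. This is an illustrative sketch: `move_after` is a hypothetical name, positions are assumed to be 1-based to match the figure numbers, and a list too short to contain every position is assumed to stay unchanged.

```python
def move_after(items, moved_positions, anchor_position):
    """Return a copy of items with the entries at moved_positions placed
    directly after the entry at anchor_position (all positions 1-based).

    A list too short to contain every position is returned unchanged.
    """
    if anchor_position in moved_positions:
        raise ValueError("anchor_position must not be one of moved_positions")

    needed = max(list(moved_positions) + [anchor_position])
    if len(items) < needed:
        return list(items)

    moved = [items[p - 1] for p in moved_positions]
    skip = {p - 1 for p in moved_positions}
    result = []
    for index, item in enumerate(items):
        if index in skip:
            continue  # re-inserted after the anchor instead
        result.append(item)
        if index == anchor_position - 1:
            result.extend(moved)
    return result


figures = [f"fig{n}" for n in range(1, 15)]
print(move_after(figures, [9, 10], 11)[8:11])
# ['fig11', 'fig9', 'fig10']
```

Because the same helper can produce both the file order and the index order, the LaTeX output and the preview list cannot drift apart.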

## Compilation Instructions

To compile the LaTeX document:

```bash
pdflatex bank_analysis.tex
```

For better bibliography support, run:

```bash
pdflatex bank_analysis.tex
bibtex bank_analysis
pdflatex bank_analysis.tex
pdflatex bank_analysis.tex
```
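
The command sequences above can also be driven from Python. The sketch below assumes `pdflatex` and `bibtex` are on the `PATH`. `build_commands` and `compile_document` are hypothetical helper names, and `-interaction=nonstopmode` is an added option so that pdflatex does not wait for keyboard input on an error. Note that `bibtex` only has work to do once the document contains `\cite` and `\bibliography` commands, which the generated template does not yet include.

```python
import os
import subprocess


def build_commands(tex_filename, with_bibtex=False):
    """Return the compilation steps as a list of argument lists."""
    stem = os.path.splitext(tex_filename)[0]
    pdflatex = ["pdflatex", "-interaction=nonstopmode", tex_filename]
    if not with_bibtex:
        return [list(pdflatex)]
    # pdflatex -> bibtex -> pdflatex -> pdflatex resolves citations and references
    return [list(pdflatex), ["bibtex", stem], list(pdflatex), list(pdflatex)]


def compile_document(tex_filename, with_bibtex=False):
    """Run each step in order; check=True stops at the first failing command."""
    for command in build_commands(tex_filename, with_bibtex):
        subprocess.run(command, check=True)


# Print the planned steps without running them
for command in build_commands("bank_analysis.tex", with_bibtex=True):
    print(" ".join(command))
```

Calling `compile_document("bank_analysis.tex")` would then run the single-pass build. It raises `FileNotFoundError` if `pdflatex` is not installed and `subprocess.CalledProcessError` if a step exits with an error.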