# Notes and Advice

## Potential Synergies Between Projects

Your current project and the new group project overlap substantially: both use the NYC Citywide Payroll dataset from NYC Open Data, both produce visualizations, and both plan to integrate large language models (LLMs). The main areas where you can share work are:

### Potential Synergies

1. **Data Source**:
   - Both projects use the same NYC Open Data source for Citywide Payroll data. You can share the data extraction and transformation logic to avoid duplication of effort.

2. **ETL Process**:
   - The ETL process you have already implemented can be reused or adapted for the new project. This includes extracting data from the API, transforming it, and loading it into a suitable format for visualization.

3. **Visualization**:
   - Your current project uses Tableau for data visualization, while the new project aims to build a Dash app. You can share insights and best practices for creating effective visualizations across both tools.
   - Consider using the two together: Tableau for interactive dashboards and Dash for web-based visualizations built on the same transformed data.

4. **LLM Integration**:
   - Both projects aim to integrate LLMs. You can collaborate on developing and fine-tuning LLMs to provide natural language query capabilities and insights.
   - Share the LangChain integration you have implemented to enhance natural language query responses with detailed field-level insights.

5. **Presentation and Documentation**:
   - Collaborate on creating a comprehensive presentation and documentation for the NYC Open Data Week event. Share your experiences, challenges, and solutions.
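
As a rough illustration of what shared extraction and transformation logic could look like, here is a minimal, dependency-free sketch. It is not the contents of the existing `etl.py`; the function names (`build_payroll_url`, `clean_record`, `fetch_payroll_page`) are hypothetical, and the numeric field names are assumptions that should be checked against the dataset. The `$limit` and `$offset` query parameters are the standard Socrata paging parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the Dash example in these notes
BASE_URL = "https://data.cityofnewyork.us/resource/k397-673e.json"

# Assumed numeric columns; verify against the dataset's column list
NUMERIC_FIELDS = ("base_salary", "regular_gross_paid", "total_ot_paid")


def build_payroll_url(limit=1000, offset=0, fiscal_year=None):
    """Build a paged query URL; optionally filter on fiscal_year."""
    params = {"$limit": limit, "$offset": offset}
    if fiscal_year is not None:
        params["fiscal_year"] = fiscal_year
    # safe="$" keeps the Socrata parameter names readable in the URL
    return f"{BASE_URL}?{urlencode(params, safe='$')}"


def clean_record(raw):
    """Copy a raw record and convert the numeric fields to float (or None)."""
    cleaned = dict(raw)
    for field in NUMERIC_FIELDS:
        value = raw.get(field)
        try:
            cleaned[field] = float(value) if value is not None else None
        except (TypeError, ValueError):
            cleaned[field] = None
    return cleaned


def fetch_payroll_page(limit=1000, offset=0, fiscal_year=None):
    """Download one page of records and clean each one (needs network access)."""
    url = build_payroll_url(limit, offset, fiscal_year)
    with urlopen(url, timeout=30) as response:
        return [clean_record(record) for record in json.load(response)]
```

Keeping the URL builder and the cleaning step separate from the network call means both projects can unit-test the transformation without hitting the API.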

### Example Workflow for Collaboration

1. **Data Extraction and Transformation**:
   - Share the `etl.py` module for extracting and transforming data.
   - Adapt the transformation logic as needed for both Tableau and Dash visualizations.

2. **Visualization**:
   - Use Tableau for creating interactive dashboards.
   - Use Dash for building web-based visualizations.
   - Share insights and best practices for both tools.

3. **LLM Integration**:
   - Share the `langchain_agent.py` module for integrating LLMs.
   - Collaborate on fine-tuning the LLMs for natural language queries.

4. **Presentation**:
   - Work together on creating a compelling presentation for the NYC Open Data Week event.
   - Highlight the synergy between Tableau and Dash visualizations and the integration of LLMs.
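
One piece of the LLM work that both projects could share regardless of framework is the field-level context given to the model. The sketch below is a hypothetical helper, not part of the existing `langchain_agent.py`, and it does not call LangChain at all: it only builds a plain-text prompt. The field descriptions are illustrative wording, not taken from the official data dictionary.

```python
# Illustrative field notes; replace with wording from the official data dictionary
FIELD_NOTES = {
    "agency_name": "Name of the city agency that employs the person.",
    "title_description": "Civil service title of the position.",
    "base_salary": "Base salary; interpret together with pay_basis.",
    "pay_basis": "Unit that base_salary is expressed in.",
}


def build_field_context(field_notes, fields=None):
    """Render field descriptions as a bulleted block for a prompt."""
    selected = fields if fields is not None else list(field_notes)
    lines = ["Dataset fields:"]
    for name in selected:
        note = field_notes.get(name, "No description available.")
        lines.append(f"- {name}: {note}")
    return "\n".join(lines)


def build_prompt(question, field_notes, fields=None):
    """Combine the field context with a user question."""
    context = build_field_context(field_notes, fields)
    return f"{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("Which agencies have the highest base salaries?", FIELD_NOTES))
```

The resulting string can be passed to whichever LLM client each project uses, so the field notes stay in one shared place.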

### Example Code for Dash Integration

Here is an example of how you can integrate Dash for visualizations:

1. **Install Dash**:
   ```sh
   pip install dash
   ```

2. **Create a Dash App**:

   Save the following as `app/dash_app.py`:

```python
# filepath: /mnt/c/Users/Boris_Li/OneDrive/Job_search_2024/Software_Engineer_roles/Tableau-DataDev-2024-2025-Hackathon/app/dash_app.py
import dash
from dash import dcc, html  # the separate dash_core_components / dash_html_components packages are deprecated
import plotly.express as px
import pandas as pd

# Initialize the Dash app
app = dash.Dash(__name__)

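# Example data: the Socrata endpoint returns 1,000 rows by default
df = pd.read_json("https://data.cityofnewyork.us/resource/k397-673e.json")

# The API serves numbers as strings, so coerce the plotted columns to numeric
for col in ("base_salary", "regular_gross_paid"):
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Create a Plotly figure; the dataset has no "total_pay" column, so this plots
# "regular_gross_paid" instead (verify column names against the dataset)
fig = px.scatter(df, x="base_salary", y="regular_gross_paid", color="title_description", title="Citywide Payroll Data")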

# Define the layout of the Dash app
app.layout = html.Div(children=[
    html.H1(children='Citywide Payroll Data Visualization'),

    dcc.Graph(
        id='example-graph',
        figure=fig
    )
])

if __name__ == '__main__':
    # app.run replaces the deprecated app.run_server in current Dash releases
    app.run(debug=True)
```

3. **Run the Dash App**:
   ```sh
   python app/dash_app.py
   ```
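
The scatter plot in step 2 colors points by `title_description`, which can have many distinct values. One option, shown here as a sketch rather than a required step, is to aggregate the records before plotting. The helper name `average_by_group` is hypothetical, and it assumes records are dictionaries like those returned by the API, with numeric values that `float()` can parse.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(rows, group_field, value_field, top_n=10):
    """Return the top_n (group, average) pairs, highest average first."""
    groups = defaultdict(list)
    for row in rows:
        key = row.get(group_field)
        value = row.get(value_field)
        # Skip records that are missing either the group or the value
        if key is None or value is None:
            continue
        groups[key].append(float(value))
    averages = [(key, mean(values)) for key, values in groups.items()]
    averages.sort(key=lambda pair: pair[1], reverse=True)
    return averages[:top_n]


if __name__ == "__main__":
    # Made-up sample records for illustration only
    sample = [
        {"agency_name": "Agency A", "base_salary": "100"},
        {"agency_name": "Agency A", "base_salary": "200"},
        {"agency_name": "Agency B", "base_salary": "300"},
    ]
    for agency, avg in average_by_group(sample, "agency_name", "base_salary"):
        print(agency, avg)
```

The two columns of the result can then be handed to a bar chart, for example by passing the group names as `x` and the averages as `y` to `px.bar`.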

### Conclusion

By collaborating on the ETL process, visualization, and LLM integration, both projects can benefit from shared knowledge and resources. This collaboration can lead to a more comprehensive and impactful presentation at the NYC Open Data Week event.
