Add Streamlit app with Excel data visualization #228
Signed-off-by: Fabiana ⚡️ Campanari <113218619+FabianaCampanari@users.noreply.github.com>
Pull request overview
Adds a new Streamlit example app that visualizes an Excel credit dataset using tabbed views, aligning with the repository’s “Building dashboards in Python and Streamlit” materials.
Changes:
- Load credit data from an Excel file into a pandas DataFrame.
- Add Streamlit tabs to show the raw table, descriptive statistics, a bar chart, and a pie chart.
- Use Plotly Express for the pie chart visualization.
| df = pd.read_excel("BASE01.CREDITO.xlsx") | ||
pd.read_excel("BASE01.CREDITO.xlsx") relies on the current working directory; when running streamlit run .../Streamlit/Codes/part_03_streamlit_excel_tabs_charts.py from the repo root this file is not found (the dataset lives under .../Streamlit/Datasets/ and .../Datasets/). Use a path built relative to __file__ (e.g., Path(__file__).parent.parent / "Datasets" / ...) or provide a st.file_uploader so the app works regardless of where it’s launched from.
| df = pd.read_excel("BASE01.CREDITO.xlsx") | |
| from pathlib import Path | |
| script_dir = Path(__file__).resolve().parent | |
| excel_name = "BASE01.CREDITO.xlsx" | |
| excel_candidates = [ | |
| script_dir.parent / "Datasets" / excel_name, | |
| script_dir.parent.parent / "Datasets" / excel_name, | |
| ] | |
| excel_path = next((path for path in excel_candidates if path.exists()), excel_candidates[0]) | |
| df = pd.read_excel(excel_path) |
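The suggestion above falls back to the first candidate when neither file exists, so the eventual error would mention only that one path. A minimal sketch of a variant that instead reports every location it tried is shown here; the helper name `find_dataset` and its parameters are hypothetical and not part of the PR, and the two candidate directories are simply the ones from the suggestion.

```python
from pathlib import Path


def find_dataset(script_file: str, name: str) -> Path:
    """Return the first existing candidate path for `name`, or raise.

    Hypothetical helper: candidates mirror the two "Datasets" locations
    used in the review suggestion, resolved relative to the script file.
    """
    script_dir = Path(script_file).resolve().parent
    candidates = [
        script_dir.parent / "Datasets" / name,
        script_dir.parent.parent / "Datasets" / name,
    ]
    for path in candidates:
        if path.exists():
            return path
    # Nothing matched: list every path that was checked.
    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"{name} not found; tried: {tried}")
```

In the app this could be used as `pd.read_excel(find_dataset(__file__, "BASE01.CREDITO.xlsx"))`, which keeps the load independent of the working directory.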
| df = pd.read_excel("BASE01.CREDITO.xlsx") | ||
This Excel read happens at the top level, so Streamlit will re-read the file on every rerun (widget interaction, code change), which can noticeably slow the app. Wrap the load in a small function decorated with st.cache_data (and optionally accept a path argument) so the data is cached across reruns.
| df = pd.read_excel("BASE01.CREDITO.xlsx") | |
| @st.cache_data | |
| def load_data(path="BASE01.CREDITO.xlsx"): | |
| return pd.read_excel(path) | |
| df = load_data() |
| df_media_sexo = ( | ||
| df[["sexo", "valorcredito"]] | ||
| .groupby(by=["sexo"]) | ||
| .sum() | ||
| .reset_index() | ||
| ) |
The name df_media_sexo suggests a mean/average (“média”), but the aggregation is .sum(). Rename the variable to match the aggregation (e.g., soma/total) or change the aggregation to .mean() if the intention is an average, to avoid misleading future readers.
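The two options in this comment can be sketched as follows. The column names `sexo` and `valorcredito` come from the PR; the rows are made up for illustration, and the name `df_total_sexo` is only one possible rename, not something the author has chosen.

```python
import pandas as pd

# Made-up rows for illustration; only the column names come from the PR.
df = pd.DataFrame(
    {
        "sexo": ["F", "F", "M", "M"],
        "valorcredito": [100.0, 300.0, 200.0, 400.0],
    }
)

# Option 1: keep .sum() and name the result after the aggregation.
df_total_sexo = df.groupby("sexo", as_index=False)["valorcredito"].sum()

# Option 2: keep the "media" name and switch the aggregation to .mean().
df_media_sexo = df.groupby("sexo", as_index=False)["valorcredito"].mean()
```

Which option fits depends on what the bar chart is meant to show: total credit per group or average credit per group.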
| .reset_index() | ||
| ) | ||
| st.title("Layout: Sidebar, Colunas e Abas") |
The page title says "Sidebar, Colunas e Abas", but this script only demonstrates tabs/charts (no st.sidebar or st.columns). Consider updating the title to reflect the actual layout being used so the UI description matches the content.
| st.title("Layout: Sidebar, Colunas e Abas") | |
| st.title("Layout: Abas e Gráficos") |
No description provided.