st.data_editor does not return added rows when dataframe from file #6995

Closed
3 of 4 tasks
hEAkahEq opened this issue Jul 12, 2023 · 7 comments · Fixed by #7481
Labels
feature:st.data_editor priority:P2 status:confirmed Bug has been confirmed by the Streamlit team type:bug Something isn't working

Comments

@hEAkahEq

hEAkahEq commented Jul 12, 2023

Checklist

  • I have searched the existing issues for similar issues.
  • I added a very descriptive title to this issue.
  • I have provided sufficient information below to help reproduce this issue.

Summary

st.data_editor does not return added rows when it is created from a dataframe that was loaded from a JSON file via st.file_uploader. The added rows are displayed in the widget, but they are missing from the dataframe that data_editor returns.

  • deleting rows works fine
  • added rows are accessible in a roundabout way via st.session_state["key_of_data_editor"]

A forum post on this issue can be found here: https://discuss.streamlit.io/t/adding-rows-in-st-data-editor-from-loaded-dataframe/46439

Saved_dataframe.txt

Reproducible Code Example


import streamlit as st
import pandas as pd


def onchange():
    st.session_state.file_changed = True

# initialize Dataframe and state variables
if 'df' not in st.session_state:
    st.session_state.df = pd.DataFrame({'name': ["1", "2"], 'Type': None})
    st.session_state.file_changed = False
    st.session_state.loaded = False

# file upload
uploaded_file = st.file_uploader("Choose a file", on_change=onchange)
if uploaded_file is not None:
    st.session_state.loaded = True
    if st.session_state.file_changed:
        st.session_state.df = pd.read_json(uploaded_file)
        st.session_state.file_changed = False

# create editable Dataframe
edited_df = st.data_editor(
    st.session_state.df,
    num_rows="dynamic",
    hide_index=True,  # this is relevant for the workaround, not sure about the bug
    key='demo_df'
)

# bug: edited_df contains the full displayed content (including added rows) only if the dataframe was not loaded from a file
# deleted rows are handled correctly either way
st.write("edited_df:")
st.write(edited_df)
# path to workaround: the changes (from the loaded file) are visible in the df in session_state when accessed by key
st.write(st.session_state["demo_df"])

Steps To Reproduce

  1. Add a row to the data_editor --> OK: the output dataframe shows the added row
  2. Load the example JSON file (attached, or create a file with this content: {"name": {"0": "0", "1": "1", "2": "2", "3": "3"}, "Type": {"0": null, "1": null, "2": null, "3": null}})
  3. Add a row to the data_editor --> bug: the output dataframe does not show the added row

A workaround to get the visualized data:

if st.session_state.loaded:
    fixed_df = pd.concat([edited_df, pd.DataFrame(st.session_state["demo_df"].get("added_rows"))])
else:
    fixed_df = edited_df
st.write(fixed_df)
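
The same workaround can be sketched as a small pandas-only helper, which makes it easier to try outside a running app. This is only a sketch: `append_added_rows` is a hypothetical name, and it assumes the widget state stored under the editor key is a dict whose "added_rows" entry is a list of dicts keyed by column name, as the workaround above implies. Unlike the original workaround, it passes `ignore_index=True` so the result gets a fresh 0..n-1 index.

```python
import pandas as pd


def append_added_rows(edited_df, editor_state):
    """Return edited_df plus any rows listed under "added_rows" in editor_state.

    editor_state is assumed to look like st.session_state["demo_df"]:
    a dict with an "added_rows" list of {column_name: value} dicts.
    """
    added = editor_state.get("added_rows") or []
    if not added:
        # Nothing was added in the GUI, so the returned dataframe is already complete.
        return edited_df
    added_df = pd.DataFrame(added)
    # ignore_index=True renumbers the rows instead of keeping duplicate labels.
    return pd.concat([edited_df, added_df], ignore_index=True)


# Stand-in data for illustration only (not taken from a real session).
edited_df = pd.DataFrame({"name": ["0", "1"], "Type": ["x", "y"]})
state = {"edited_rows": {}, "added_rows": [{"name": "2", "Type": "z"}], "deleted_rows": []}
fixed_df = append_added_rows(edited_df, state)
print(fixed_df)
```

In the app this would still need to be gated on `st.session_state.loaded`, as in the workaround above, because in the non-buggy case `edited_df` already contains the added rows and they would be duplicated.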

Expected Behavior

data_editor should behave the same no matter where the dataframe used to create it comes from.

Current Behavior

edited_df does not contain rows added in the GUI when it was created from a dataframe loaded via st.file_uploader

Is this a regression?

  • Yes, this used to work in a previous version.

Debug info

  • Streamlit version: 1.24.1
  • Python version: 3.11
  • Operating System: Win10
  • Browser: Opera, Chrome

Additional Information

No response

@hEAkahEq hEAkahEq added status:needs-triage Has not been triaged by the Streamlit team type:bug Something isn't working labels Jul 12, 2023
@LukasMasuch
Collaborator

@hEAkahEq Thanks for reporting this issue! I was able to reproduce it here. This indeed looks like a bug and it needs a bit more investigation to figure out the root cause for this issue.

@LukasMasuch LukasMasuch added status:confirmed Bug has been confirmed by the Streamlit team priority:P2 and removed status:needs-triage Has not been triaged by the Streamlit team labels Jul 13, 2023
@hEAkahEq
Author

Thank you Lukas.
Maybe some additional information will help: after more investigation, it seems to me that this could be related to dataframes with different origins having different "kinds" of indexes.
When not using hide_index=True, adding rows works, but the data_editor shows one additional column.

And in the debugger it looks like "index" has a different type ("Index" vs. "RangeIndex"), but I don't understand pandas well enough to know if this is significant.
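
The index difference can be checked outside Streamlit with a few lines of pandas. The following is a minimal sketch that rebuilds both dataframes from the issue: the initial one from the reproducer and the one read from the sample JSON content in the steps above. The variable names are made up for illustration, and `io.StringIO` stands in for the uploaded file.

```python
import io

import pandas as pd

# Sample JSON content from the "Steps To Reproduce" section.
sample_json = (
    '{"name": {"0": "0", "1": "1", "2": "2", "3": "3"}, '
    '"Type": {"0": null, "1": null, "2": null, "3": null}}'
)

# Dataframe built in code, as in the reproducer's initialization.
from_dict = pd.DataFrame({"name": ["1", "2"], "Type": None})

# Dataframe read from JSON; the row labels come from the keys in the file.
from_json = pd.read_json(io.StringIO(sample_json))

# Print the index class of each dataframe for comparison.
print("built in code:", type(from_dict.index).__name__)
print("read from JSON:", type(from_json.index).__name__)
print("JSON index is a RangeIndex:", isinstance(from_json.index, pd.RangeIndex))
```

`isinstance(df.index, pd.RangeIndex)` is the check that distinguishes the two cases described here.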

@hEAkahEq
Author

I now have an easy workaround in case others encounter this problem:
I just add st.session_state.df.reset_index(drop=True, inplace=True) at the end of the file upload. This ensures the st.session_state.df has a "RangeIndex", which is probably what data_editor implicitly expects.
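
As a rough sketch of this workaround, the read and the index reset can be wrapped in one helper. `read_json_with_range_index` is a hypothetical name, and the inline JSON is made-up sample data; it uses the non-inplace form of `reset_index`, which returns a new dataframe.

```python
import io

import pandas as pd


def read_json_with_range_index(source):
    """Read JSON into a dataframe and replace its index with a default RangeIndex."""
    df = pd.read_json(source)
    # drop=True discards the old labels instead of adding them as a column.
    return df.reset_index(drop=True)


# io.StringIO stands in for the object returned by st.file_uploader.
demo = read_json_with_range_index(io.StringIO('{"name": {"0": "a", "1": "b"}}'))
print(type(demo.index).__name__)
```

In the reproducer, the upload branch would then assign `st.session_state.df = read_json_with_range_index(uploaded_file)` instead of calling `pd.read_json` directly.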

@LukasMasuch
Collaborator

@hEAkahEq Yep, that is indeed the issue here. If the index of the dataframe is not a RangeIndex, we cannot automatically insert this row into the dataframe without a non-None index value provided by the user. That is why the index column is also shown as editable.

@LukasMasuch
Collaborator

One aspect to improve here from a UI perspective is that we set non-RangeIndex columns to required automatically.

@LukasMasuch
Collaborator

Which Pandas version do you use on your system? Is the issue solved if you update to the latest Pandas version?

@hEAkahEq
Author

@LukasMasuch Thanks for looking into this. I updated Pandas from 2.0.2 to 2.0.3 and it does not appear to be fixed.

2 participants