I reviewed how this project's dataframe is built. When the dataframe has an image_url column,
it seems that the fast image display works by rendering each cell as an img src tag, passing it to "repr_html", and letting Jupyter notebook render the HTML, rather than downloading the actual image.
(The YAML config file also defines the image formatter used to display images at different sizes.)
As your documentation says, the "to_" prefixed methods (such as to_csv and to_arrow) all drop
the image column, while the "write" and "read" methods save only the "config" and do not trigger an actual download.
This design makes the conversion of images a "lossy transformation". It is inconvenient when I want to quickly initialize a Hugging Face dataset from your dataframe (e.g. Dataset(df.to_arrow())).
I think you should add a trigger that actually downloads the images in the image column, wrapped in a timeout
decorator. Since you already have a _write_empty_image definition, adding this function may be easy.
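
To make the request concrete, here is a minimal sketch of what such a trigger could look like. It is not this project's API: `fetch_image`, `materialize_images`, and `EMPTY_IMAGE` are hypothetical names, and `EMPTY_IMAGE` only stands in for whatever `_write_empty_image` produces. Instead of a timeout decorator, the sketch uses the `timeout` argument of the standard library's `urlopen`, which limits blocking socket operations rather than the total download time.

```python
from concurrent.futures import ThreadPoolExecutor
from http.client import HTTPException
from typing import Callable, Iterable, List
from urllib.request import urlopen

# Hypothetical placeholder for a failed download; a stand-in for
# whatever the project's _write_empty_image would produce.
EMPTY_IMAGE = b""


def fetch_image(url: str, timeout: float = 5.0) -> bytes:
    """Download one image and return its raw bytes.

    Returns EMPTY_IMAGE on a timeout, network error, or malformed URL,
    so that one bad URL does not abort the whole column.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (OSError, ValueError, HTTPException):
        return EMPTY_IMAGE


def materialize_images(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_image,
    max_workers: int = 8,
) -> List[bytes]:
    """Download every URL concurrently, preserving the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

The resulting list of bytes could then be stored as a regular binary column before calling to_arrow, so the images would survive the conversion instead of being dropped.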