When I load the train/val/test CSV files that I split myself using load_data_from_folder in multimodal_transformers.data, the returned train_dataset/val_dataset/test_dataset have strange lengths that do not match the row counts of the original CSV files.
For example, train_df.shape = (105195,25), while train_dataset.cat_feats.shape = (131495,38).
To split the dataset I tried both train_test_split and np.split, and both gave me the same loading issue.
But if I follow the exact same code as the notebook for splitting the datasets, load_data_from_folder works well. Also, if I modify one column, for example mapping the numbers to letters ([0,2,0...] to [A,B,A...]), the data again fails to load correctly.
Does anyone have any suggestions?
Hi, I have found the problem. When I ran pip install multimodal-transformers, somehow version 0.11a0 was installed instead of the latest one. In 0.11a0 there is a bug in the load_data.py file, which is fixed in the newest version: train_df=data_df.iloc[:len_train].
(The old version uses df.loc[train_df.index], which selects rows by index label rather than by position.)
So I manually changed the .py file and it works now.
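For anyone hitting the same thing, here is a minimal sketch of why label-based selection can return too many rows. It assumes the library concatenates the three splits into one frame before slicing them apart again, and that each CSV is read with its own default 0..n-1 index; the frames and the column name x are made up for illustration and are not taken from load_data.py.

```python
import pandas as pd

# Hypothetical toy splits; each one has its own default index starting at 0,
# as happens when separately saved CSV files are read back in.
train_df = pd.DataFrame({"x": range(6)})      # index 0..5
val_df = pd.DataFrame({"x": range(6, 8)})     # index 0..1
test_df = pd.DataFrame({"x": range(8, 10)})   # index 0..1

# Assumption: the splits are concatenated without resetting the index,
# so labels 0 and 1 now each appear three times.
data_df = pd.concat([train_df, val_df, test_df])
len_train = len(train_df)

# Label-based selection (the old behaviour described above): every row whose
# label occurs in train_df.index is returned, including val/test rows.
by_label = data_df.loc[train_df.index]

# Position-based selection (the fixed behaviour): exactly the first len_train rows.
by_position = data_df.iloc[:len_train]

print(len(by_label), len(by_position))
```

In this toy case the label-based slice has more rows than the training split, while the position-based slice has exactly len_train rows. The numbers in the report fit the same pattern: 131495 - 105195 = 26300 extra rows, which would be consistent with val and test rows sharing index labels with the training set, though I have not verified the split sizes.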