Implement script for extracting features #85
Comments
Hi! I'd like to work on this issue. I hope it's not done yet :) |
@churnikov you are welcome! @DaloroAT will provide you with a road map today or tomorrow |
Sounds good :) I've looked through the contribution guide, and I hope you'll give me details on what you expect :) |
Hey @churnikov ! We can split this script into several parts:
We have no
Design
    from torch.utils.data import Dataset
    import torch

    class ListDataset(Dataset):
        def __init__(self, filenames_list, transforms, f_imread):
            ...

        def __getitem__(self, idx) -> torch.Tensor:
            ...
Please, don't forget to support different types of transforms. Now we use transforms from
Transforms are responsible for augmentations of the original images, resizing and normalizing.
Test
Iterate over the files in mock dataset with dataloader and check shapes of batches. Mock dataset can be found in
    from oml.const import MOCK_DATASET_PATH
    print(MOCK_DATASET_PATH / 'images')
If you have no mock dataset locally, use
What do you think @AlekseySh ? |
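For anyone following along, the design and test above can be read roughly like this. It is only a dependency-free sketch, not the implementation from the repo: the real class would subclass `torch.utils.data.Dataset` and return a `torch.Tensor` as in the signature above, and the test would use a real `DataLoader` with the mock dataset. Here nested Python lists stand in for tensors, and `iter_batches`, `fake_imread` and `center_crop_2x2` are hypothetical helpers invented for the illustration.

```python
from typing import Callable, Iterator, List, Optional, Sequence

Image = List[List[float]]  # stand-in for an image tensor: H x W nested lists


class ListDataset:
    """Map-style dataset over a list of file names (torch-free stand-in)."""

    def __init__(
        self,
        filenames_list: Sequence[str],
        transforms: Optional[Callable[[Image], Image]],
        f_imread: Callable[[str], Image],
    ) -> None:
        self.filenames_list = list(filenames_list)
        self.transforms = transforms
        self.f_imread = f_imread

    def __len__(self) -> int:
        return len(self.filenames_list)

    def __getitem__(self, idx: int) -> Image:
        # Read the image, then apply the transforms if any were given.
        image = self.f_imread(self.filenames_list[idx])
        if self.transforms is not None:
            image = self.transforms(image)
        return image


def iter_batches(dataset: ListDataset, batch_size: int) -> Iterator[List[Image]]:
    """Yield consecutive batches; a simplified stand-in for a DataLoader."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


def fake_imread(path: str) -> Image:
    """Hypothetical reader: returns a 4x6 'image' instead of touching disk."""
    return [[float(len(path))] * 6 for _ in range(4)]


def center_crop_2x2(image: Image) -> Image:
    """Hypothetical transform: crop every image to the same 2x2 size."""
    top = (len(image) - 2) // 2
    left = (len(image[0]) - 2) // 2
    return [row[left:left + 2] for row in image[top:top + 2]]


dataset = ListDataset(["a.jpg", "bb.jpg", "ccc.jpg"], center_crop_2x2, fake_imread)

# Shape of each batch as (batch, height, width); the last batch may be smaller.
batch_shapes = [
    (len(batch), len(batch[0]), len(batch[0][0]))
    for batch in iter_batches(dataset, batch_size=2)
]
print(batch_shapes)
```

Passing `f_imread` and `transforms` in as callables keeps the dataset independent of any particular image or augmentation library, which is probably the easiest way to support different types of transforms.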
@DaloroAT Agree, let's start by implementing if the task is clear for you, @churnikov ? |
Sounds good :) |
Great! I created a separate issue #205 for this particular task. Let's continue there @churnikov @AlekseySh . We'll come back here when we're done with the dataset. |
Hi! I think, as #205 is done, we can continue with this task 😄 |
Great! Thank you @churnikov Now we have all the components to implement that script. Let's assume that our script's goal is to extract images' features and save them to a file.
Place your solution into the
I'm not sure about the format of the features file, but
Config
We need to create and support a config with the parameters:
    images_folder: ...
    dataframe_name: ...
    features_file: ...
    batch_size: ...
    num_workers: ...
    transforms:
      ...
    model:
      ...
Features file
The structure of
    {
      "images_folder": <from config>,
      "dataframe_name": <from config>,
      "model": <nested dict from config section>,
      "transforms": <nested dict from config section>,
      "filenames": [file1, file2, ...],
      "features": [vector1, vector2, ...]
    }
Test
Scenario:
Check example1 and example2 to see how to run tests with configs.
Check list
Would you like to add something @AlekseySh ? |
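To make the config and features-file layout above a bit more concrete, here is a minimal sketch of how the two could be wired together. Several things are assumptions for the sake of the example: the config is already parsed into a dict (YAML loading is left out), `extract_features` and `fake_model` are hypothetical stand-ins for the real script and model, the file names are passed in directly instead of being read from the dataframe, and JSON is used only because the thread has not settled on a file format.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List, Sequence

Vector = List[float]


def extract_features(
    config: Dict[str, Any],
    filenames: Sequence[str],
    model: Callable[[Sequence[str]], List[Vector]],
) -> Dict[str, Any]:
    """Run the model batch by batch and build the features-file payload."""
    batch_size = config["batch_size"]
    features: List[Vector] = []
    for start in range(0, len(filenames), batch_size):
        features.extend(model(filenames[start:start + batch_size]))

    # Same keys as in the structure described above.
    return {
        "images_folder": config["images_folder"],
        "dataframe_name": config["dataframe_name"],
        "model": config["model"],
        "transforms": config["transforms"],
        "filenames": list(filenames),
        "features": features,
    }


def fake_model(batch: Sequence[str]) -> List[Vector]:
    """Hypothetical model: one 2-d 'embedding' per file name."""
    return [[float(len(name)), 1.0] for name in batch]


with tempfile.TemporaryDirectory() as tmp_dir:
    config = {
        "images_folder": "images",  # placeholder values, not real defaults
        "dataframe_name": "df.csv",
        "features_file": str(Path(tmp_dir) / "features.json"),
        "batch_size": 2,
        "num_workers": 0,  # unused in this sketch
        "transforms": {"name": "hypothetical_transforms"},
        "model": {"name": "hypothetical_model"},
    }
    payload = extract_features(config, ["a.jpg", "bb.jpg", "ccc.jpg"], fake_model)

    # Save the payload, then read it back to check the round trip.
    Path(config["features_file"]).write_text(json.dumps(payload))
    restored = json.loads(Path(config["features_file"]).read_text())
```

Keeping `filenames` and `features` as parallel lists means the i-th vector always belongs to the i-th file, and copying the `model` and `transforms` sections into the file records how the features were produced.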
@DaloroAT I am good.
Basically, I implemented a very rough draft of it myself and used it in my latest experiments :) |
Oh, a new PR) Anyway, this part of my comment was about the file format for users who need to extract and save features for their own purposes |
But if you have found a use case, we can add that to the task |
I use pickle for myself, but it's not language-agnostic. |
Hi @DaloroAT You suggested discussing the docs at this point. |
Hey @churnikov Let's discuss the docs. We will add a new section to the big examples section; you can check the structure via the link. You can create a separate markdown snippet and place it there. In general, you should highlight the following points:
Start with how you would like to see the manual as a lazy external user, then we can add some details during review. |
we need something like python extract.py and a YAML config which parametrizes the model and DataLoader
probably we should store features in hdf5 format, which may be useful for users who don't know Python