
Restructuring code for models repo. #60

Closed · wants to merge 5 commits

Conversation


@kartikdutt18 kartikdutt18 commented Mar 9, 2020

Hi everyone, here is the list of current changes:

  1. Implemented models (DigitRecognizer and LeNet) as classes that can be included.
  2. Added a DataLoader (currently supports only MNIST).
  3. An examples folder for MNIST called mnist_tutorial.
  4. The MNIST dataset can be downloaded using -DDOWNLOAD_MNIST or -DDOWNLOAD_DATASETS.
  5. All examples use ensmallen.
  6. Unified utils folder (currently the same as Kaggle_utils; needs cleaning up before I add LSTM).

Also closes #40; right now it partially handles #61 and aims to resolve #57.
I hope it would be okay to add AlexNet here as well.
I'll also add support for augmentation, better utils, weights, and tests in the next one or two days. I would really appreciate your feedback.

Future issues:
  • Add augmentation.
  • A better DataLoader (support for selecting start and end columns).

Hi @zoq, @favre49, if you get a chance, could you please review this? I'll be making more changes to download weights and tests soon.


kartikdutt18 commented Mar 9, 2020

I have also tested LeNet4, LeNet1, and SimpleNN for 3 epochs each. They worked fine on my machine, achieving 89%, 94%, and 92% accuracy respectively on the validation dataset.

Commits:
  • Some minor fixes
  • LeNet switch
  • Add support for downloading datasets
  • Transferred DigitRecognizer completely
@prince776

Hi @kartikdutt18, I think the implementation is really good. I checked the LeNet code and have one suggestion: add an includeTop option. I'm unsure whether it's needed in LeNet, but it is needed in models like VGG, so it would make the model architectures consistent.
Let me know what you think about it.
I'm also looking forward to adding Inception to it.
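
For illustration, here is a minimal sketch of what an includeTop flag could look like, in the spirit of Keras applications. This is hypothetical code assuming the mlpack 3.x ANN API; the class name, layer sizes, and 28x28 input are made up for the example and are not the code in this PR:

```cpp
#include <mlpack/core.hpp>
#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>
#include <mlpack/methods/ann/loss_functions/negative_log_likelihood.hpp>

using namespace mlpack::ann;

class LeNetSketch
{
 public:
  LeNetSketch(const size_t numClasses, const bool includeTop = true)
  {
    // Convolutional feature extractor, shared by both variants.
    // 28x28 input -> 24x24 after the 5x5 convolution -> 12x12 after pooling.
    model.Add<Convolution<>>(1, 6, 5, 5, 1, 1, 0, 0, 28, 28);
    model.Add<LeakyReLU<>>();
    model.Add<MaxPooling<>>(2, 2, 2, 2);

    // The fully connected head is appended only when includeTop is true;
    // with includeTop = false the network can serve as a base for a
    // larger architecture.
    if (includeTop)
    {
      model.Add<Linear<>>(6 * 12 * 12, numClasses);
      model.Add<LogSoftMax<>>();
    }
  }

  FFN<NegativeLogLikelihood<>>& Model() { return model; }

 private:
  FFN<NegativeLogLikelihood<>> model;
};
```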

@kartikdutt18 (Member Author)

Hmm, I think the point of the LeNet paper was the fully connected layers after the convolutions, so if I remove them it doesn't really remain LeNet anymore. However, I don't mind adding that option to be more consistent with the rest of the code base.

@kartikdutt18 (Member Author)

Also @prince776, I probably skipped over some comments, so if you notice something, could you review it and leave a comment saying we should add this line here, or something along those lines? Thanks.

@prince776

> Also @prince776, I probably skipped over some comments, so if you notice something, could you review it and leave a comment saying we should add this line here, or something along those lines? Thanks.

Ok, I'll do that.

@prince776 prince776 left a comment

Minor comment errors are mentioned, plus a minor suggestion. Thanks.

Review comments on:
  • dataloaders/dataloader.hpp
  • examples/mnist_tutorial/CMakeLists.txt
  • examples/mnist_tutorial/mnist_tutorial_cnn.cpp
  • models/lenet/lenet.hpp
  • models/simple_nn/simple_nn.hpp
  • utils/CMakeLists.txt
@kartikdutt18 (Member Author)

Hi @prince776, thanks for the review. I made constant changes to the code without updating the comments, so this is what we ended up with. I'll fix them in the next commit; I hope you can take a look then too. Thanks.

@ojhalakshya

Hi @kartikdutt18,
Could you add more details about these classes?
Thanks.

@prince776

> Hi @prince776, thanks for the review. I made constant changes to the code without updating the comments, so this is what we ended up with. I'll fix them in the next commit; I hope you can take a look then too. Thanks.

Yes, sure. I'm really excited to work on it anyway. I'll keep reviewing to make sure I'm up to date with the codebase.

@kartikdutt18 (Member Author)

Great. I don't think I'll be able to push a commit today; I am going with a complete redesign of the DataLoader, so I'll have to make changes accordingly. The next commit will have restricted LSTMs, CSV files removed completely from the repo, and a better DataLoader.

@kartikdutt18 (Member Author)

> Hi @kartikdutt18,
> Could you add more details about these classes?
> Thanks.

Will do. Some portions of the code need to be heavily documented. Thanks.

@kartikdutt18 (Member Author)

I haven't tested these changes; they probably require some debugging. I spent yesterday making a DataLoader that I ended up deleting, so I made some changes just to make progress. I'll post the results tomorrow after the fixes. Hopefully I'll also transfer the VAE and add tests tomorrow.


kartikdutt18 commented Mar 12, 2020

Adding a todo list here to keep track of changes:

  • Models should be included as libraries / classes (a sample can be seen in PR Adding AlexNet #50), so a user can include models/alexnet.hpp and use AlexNet as a base network for something else.

  • A DataLoaders folder (needs to be heavily improved, though).

  • A unified samples folder.

  • A testing folder (this will be for contributors) where they will add tests that load weights, train for 5 epochs, and set a reasonable accuracy as a baseline. This will ensure the addition of valid models as well as streamline the process of adding models.

  • All data in one folder. I would, however, prefer that we made changes in CMake to download data rather than storing it in the repo, to keep the models repo lighter.

  • A weights folder inside the models folder, where the weights of each model are stored.

Commits:
  • New DataLoader
  • Transferred LSTMs
  • Spacing issue fixed
  • Add empty line

kartikdutt18 commented Mar 12, 2020

Hey @zoq, could you please review the DataLoader and the overall layout? I wanted to know what the appropriate way of implementing a DataLoader would be so that more datasets can be added without hassle.

@jeffin143 jeffin143 left a comment

This is more a set of questions to understand the implementation details.

Comment on lines +122 to +126
void MNISTDataLoader();

// Google Stock Prices Dataloader.
void GoogleStockPricesDataloader();

Member

I am not sure why there are 3 different loaders; can't we make one generic one? Sorry, I might have missed something.

Member Author

We need at least two data loaders: one for time series and another for datasets like CSVs. I am working on implementing a better data loader, as mentioned above; if you have any suggestions, that would be great. Currently I'm thinking of implementing LoadTrainCSV and LoadTestCSV, which will be overloaded to support all CSV-related datasets.

Member

Sorry, but can you briefly explain to me why we need at least two data loaders?

Member Author

A time-series dataset needs a parameter rho (look-back), and the data has to be modified to hold past values, so we would have to store the data in cubes, whereas for other CSV datasets matrices are sufficient. So what I was planning to implement was an overloaded LoadTrainCSV that can take rho if needed and store the data in cubes; there would also be no shuffling of the data there. From it we could call the other LoadTrainCSV to perform the tasks that are common. I might also end up with one data loader, but I think I'll know for sure only after I start making the changes.
I'm sorry, I know this isn't really coherent, but I think after I code it, it will make much better sense. I'll try to reduce redundant code as soon as possible.
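
To make the cube point concrete, here is a minimal sketch of a rho-step look-back, following the pattern used in mlpack's LSTM examples; the helper name and the assumption that samples are stored one per column are illustrative only:

```cpp
#include <mlpack/core.hpp>

// Hypothetical helper: turn a (features x timesteps) matrix into
// (features x samples x rho) cubes, where the target is the input
// window shifted one step forward.
void CreateTimeSeriesData(const arma::mat& dataset,
                          arma::cube& X,
                          arma::cube& y,
                          const size_t rho)
{
  X.set_size(dataset.n_rows, dataset.n_cols - rho, rho);
  y.set_size(dataset.n_rows, dataset.n_cols - rho, rho);

  for (size_t i = 0; i < dataset.n_cols - rho; ++i)
  {
    X.subcube(arma::span(), arma::span(i), arma::span()) =
        dataset.cols(i, i + rho - 1);
    y.subcube(arma::span(), arma::span(i), arma::span()) =
        dataset.cols(i + 1, i + rho);
  }
}
```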

Member

No issues with it :). I am totally on board; I was just narrowing down the implementation.

We can call this v1, and I hope v2 and v3 will be much better 😉

Member Author

I will try my best and hopefully v2 will be much better than what it currently is. 👍

Comment on lines 94 to 95
trainX = train.submat(1, 0, train.n_rows - 1, train.n_cols - 1) / 255.0;
validX = valid.submat(1, 0, valid.n_rows - 1, valid.n_cols - 1) / 255.0;
Member

You can use the mlpack internal scaler here, I guess.

Member Author

Hmm, then we would also need to store the scaler, and this would have to be done for all the other datasets, which would just reduce flexibility. For MNIST we can use this. For the other data loaders I have getters / setters to scale the data, so the user can call them to fit and transform the data.
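
For reference, a minimal sketch of how mlpack's MinMaxScaler would be used here; the matrix names and sizes are placeholders:

```cpp
#include <mlpack/core.hpp>
#include <mlpack/core/data/scaler_methods/min_max_scaler.hpp>

arma::mat trainX(784, 1000, arma::fill::randu);
arma::mat validX(784, 200, arma::fill::randu);

// Fit on the training features only, then apply the same transform to
// both sets; the fitted scaler must be kept around if InverseTransform()
// is needed later.
mlpack::data::MinMaxScaler scaler;
scaler.Fit(trainX);

arma::mat trainXScaled, validXScaled;
scaler.Transform(trainX, trainXScaled);
scaler.Transform(validX, validXScaled);
```

The need to keep the fitted scaler object around is exactly the storage cost mentioned above.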

Member

I am not sure that would reduce flexibility; I'd say it gives the user more features, since he can opt in to scaling.

Member Author

Hmm, I will keep that in mind while making a better data loader. The current data loader allowed me to run some tests, since there were a lot of changes. I'll push a commit with a better data loader and you can take a look at that.

@jeffin143 jeffin143 Mar 13, 2020

If you don't mind, can I work on your branch?

PS: I just understood what our actual plan is.

Member Author

I think for the plan you can refer to this comment. I think this would be a very important PR for all future models that get added, as we need to clearly lay down how a model and a dataset are to be added, which also ends up streamlining the flow. Much like mlpack, where we have a clear set of rules that we need to follow to add a feature, hopefully through this PR we can achieve the same for this repo. Thanks.

Member

> Sure, just one thing I need to mention: after my last commit, I get errors of either a segfault or a boost failed assert. I would probably need to fix that first; then I could work on the data loader, and I would love your help if you get a chance. Thanks.

I really liked the idea; it was on my todo list, something about transfer learning and the whole idea. I will take a look over the coming weekend.

Member Author

I agree, I think this would be very useful, especially for training large models. We could train without the top (adding it outside the model class) and store the weights. Then we could use those weights in a larger model like YOLO. This should decrease the number of training epochs required and hence reduce training time.
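
A minimal sketch of that flow using mlpack's serialization routines; the file name and network type are placeholders, not an API from this PR:

```cpp
#include <mlpack/core.hpp>
#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/loss_functions/negative_log_likelihood.hpp>

using namespace mlpack;

int main()
{
  // Train the topless base network, then persist it.
  ann::FFN<ann::NegativeLogLikelihood<>> base;
  // ... Add() layers and Train() the base here ...
  data::Save("base_weights.bin", "model", base, false);

  // Later: load the pretrained base and reuse its weights inside a
  // larger model, so fewer epochs are needed than training from scratch.
  ann::FFN<ann::NegativeLogLikelihood<>> pretrained;
  data::Load("base_weights.bin", "model", pretrained, false);
  return 0;
}
```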

Member Author

Something similar to ladder training.

Member Author

Yay! So I fixed the build; testing on different OSes now.

DataSetY &ValidY() { return validY; }

//! Get the Training Dataset.
arma::mat TrainCSVData() const { return trainCSVData; }
Member

How is it different from the following, on line 62?

//! Modify the Training Dataset.
DataSetX &TrainX() { return trainX; }

Member Author

Ok, so if the user decides to store augmented data without splitting it into X and Y, he can access it using this. TrainX and TrainY are useful for classification only.

Member

Why can't we return them using join_rows(), joining TrainX and TrainY? Why a special member?

Member Author

I am not sure it would work for cubes; I can try it, though. But yes, I think the data loader needs to be improved a lot; it is only in a working state for now.
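
For reference, a quick Armadillo sketch of why the join works for matrices but not directly for cubes; the dimensions are made up:

```cpp
#include <armadillo>

int main()
{
  // Matrices: with one column per sample, the label row can be stacked
  // under the features with join_cols() (join_rows() appends columns).
  arma::mat trainX(10, 100, arma::fill::randu);
  arma::mat trainY(1, 100, arma::fill::randu);
  arma::mat joined = arma::join_cols(trainX, trainY);  // 11 x 100

  // Cubes: there is no join_rows()/join_cols() for cubes; only
  // join_slices() exists, which appends along the third dimension.
  arma::cube a(10, 100, 5, arma::fill::randu);
  arma::cube b(10, 100, 5, arma::fill::randu);
  arma::cube c = arma::join_slices(a, b);  // 10 x 100 x 10
  return 0;
}
```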

Member

I will take a look

@kartikdutt18 (Member Author)

Hey @jeffin143, any suggestions for CLI?

@prince776 prince776 left a comment

Looks mostly good to me; the minor required changes are mentioned below.

dataset.n_cols - 1));
}

#endif

An extra line is needed at the end of the file.

arma::cube preds;
model.Predict(dataloader.TestX(), preds);
cout << MSE(preds, dataloader.TestY()) << endl;
}

An extra line is needed at the end of the file.

* @param datasetPath Path or name of dataset.
* @param shuffle whether or not to shuffle the data.
* @param ratio Ratio for train-test split.
* @param dropHeader Drops first row if true.

This parameter is not present in the constructor.


//! Locally stored augmented probability.
double augmentationProbability;
};

A comment like // class DataLoader should be added here, after the };.

typename DataSetX = arma::mat,
typename DataSetY = arma::mat
>
class DataLoader

I think we should add all these classes within a namespace, like mlpack::dataloaders.

arma::cube preds;
model.Predict(dataloader.TestX(), preds);
cout << MSE(preds, dataloader.TestY()) << endl;
}

An extra line is needed at the end of the file.

@kartikdutt18 (Member Author)

Thanks @prince776, I will make the changes. Also, there is a typo in the data loader that I will fix in the next commit; I spent hours using valgrind to figure out why it segfaults.

Comment on lines 117 to 120
trainCSVData = dataset.submat(arma::span(),arma::span(0, (1 - ratio) *
dataset.n_cols));
testCSVData = dataset.submat(arma::span(), arma::span((1 - ratio) * dataset.n_cols,
dataset.n_cols - 1));
Member

data::Split() can be used here, without shuffling.
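
A minimal sketch of the suggestion; the caveat behind "without shuffling" is that, at the time of this thread, data::Split() shuffled the data, which is what the follow-up PR mentioned below relates to:

```cpp
#include <mlpack/core.hpp>
#include <mlpack/core/data/split_data.hpp>

// Split the columns of `dataset` into train and test portions with a
// 25% test ratio; Split() handles the index bookkeeping internally.
arma::mat dataset(10, 100, arma::fill::randu);
arma::mat trainData, testData;
mlpack::data::Split(dataset, trainData, testData, 0.25);
```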

Member Author

Yay! Thanks for the PR. I will hopefully push a new data loader by tonight.

Member

Raised #2293.

@kartikdutt18 (Member Author)

Hey @jeffin143, sorry to bother you again, but do you think the new data loader is better than what it initially was?
I haven't added comments yet, and I still have to add support for the Google stock prices dataset, but in general does it seem like a better idea?

@jeffin143 jeffin143 left a comment

WDYT?

Comment on lines +63 to +100

void LoadTrainCSV(const std::string &datasetPath,
                  const bool shuffle,
                  const double ratio = 0.75,
                  const bool useScaler = false,
                  const bool dropHeader = false,
                  const int startInputFeatures = -1,
                  const int endInputFeatures = -1,
                  const int startPredictionFeatures = -1,
                  const int endPredictionFeatures = -1,
                  const std::vector<std::string> augmentation =
                      std::vector<std::string>(),
                  const double augmentationProbability = 0.2);

void LoadTrainCSV(const std::string &datasetPath,
                  const double ratio = 0.75,
                  const int rho = 10,
                  const bool useScaler = false,
                  const bool dropHeader = false,
                  const int startInputFeatures = -1,
                  const int endInputFeatures = -1,
                  const size_t inputSize = 1,
                  const size_t outputSize = 1);

void LoadTestCSV(const std::string &datasetPath,
                 const bool useScaler = false,
                 const bool dropHeader = false,
                 const int startInputFeatures = -1,
                 const int endInputFeatures = -1);

void LoadTestCSV(const std::string &datasetPath,
                 const int rho = 10,
                 const bool useScaler = false,
                 const bool dropHeader = false,
                 const int startInputFeatures = -1,
                 const int endInputFeatures = -1,
                 const size_t inputSize = 1,
                 const size_t outputSize = 1);
Member

@kartikdutt18, I just took a quick glance; I guess we don't need all these multiple functions.

Maybe just use a LoadCSV() function, and if the user wants, he can call it twice:
LoadCSV(input, "mytest") -> for the test CSV
LoadCSV(input_train, "mytrain") -> for the train CSV

Or, if he doesn't have the data in split format, he can call the function to load the data and split it too:
LoadCSV(input, "data", split = true) -> for the train and test CSV

Not sure; this is a vague idea and I am thinking aloud, and I probably need to write it down. I was working on something else just now and hence didn't dwell on it much. Will take a look tonight.

Member Author

I can do that, but I would end up adding if conditions, which aren't clean; still, I get your point and I don't mind doing that. I think I can add another option, time_series = true / false, to remove the remaining overloaded functions.
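
Putting the two ideas together, a purely hypothetical unified signature might look like the following; none of these parameter names come from the PR, and the exact set of options is just a sketch:

```cpp
// Hypothetical single entry point replacing the four overloads:
// `split` controls whether a train/test split happens on load, and
// `timeSeries` (with `rho`) switches the output from matrices to
// look-back cubes.
void LoadCSV(const std::string& datasetPath,
             const bool split = true,
             const double ratio = 0.75,
             const bool timeSeries = false,
             const int rho = 10,
             const bool useScaler = false,
             const bool dropHeader = false,
             const int startInputFeatures = -1,
             const int endInputFeatures = -1);
```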

@kartikdutt18 kartikdutt18 mentioned this pull request Mar 14, 2020
@prince776 prince776 left a comment

Just some small problems I found while working with the models.

Review comments on models/lenet/lenet.hpp (two threads).
@kartikdutt18 (Member Author)

Closing this; it will be shifted to the new repo.

Successfully merging this pull request may close these issues:

Implementation of Tests? Addition of MobileNet and architecture to load certain datasets.

5 participants