Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix bug with symbol loading #9573

Closed
wants to merge 21 commits into from
Closed

Fix bug with symbol loading #9573

wants to merge 21 commits into from

Conversation

dabraude
Copy link
Contributor

Description

Simplified some processes using the C API

Checklist

Essentials

  • [X ] Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • [ N/A ] All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • [X To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Helper functions for using the C API

Comments

Added two functions that wrap existing functionality to make it easier to use in C / C++ applications.

MXInitialise
Readies mxnet for use. While this is possible with some other commands this function makes more sense and does not require unneeded variables.

MXNDArrayLoadListFromMemory
Duplicates MXNDArrayLoad but from a location in memory. This was needed for my work as we bundle models into another file.

MXInitialise
Readies mxnet for use. While this is possible with some other commands this function makes more sense and does not require unneeded variables.

MXNDArrayLoadListFromMemory
Duplicates MXNDArrayLoad but from a location in memory. This was needed for my work as we bundle models into another file.
* This ensures the engine has been initialised and
* the operators have been registered.
*/
MXNET_DLL void MXInitialise();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case is it required to call this method? It sounds like this should be documented widely.

Copy link
Contributor Author

@dabraude dabraude Jan 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under the hood it just calls the list symbols method which registers the operators, but this means the purpose of the call is clearer. The examples all call something that calls the same method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Initialize is spelled with a 'z' :)

@marcoabreu
Copy link
Contributor

marcoabreu commented Jan 26, 2018 via email

@dabraude
Copy link
Contributor Author

with just this

if you comment out the MXInitialise():

`
#include <stdlib.h>
#include <stdio.h>
#include <mxnet/c_api.h>

int main() {

mx_uint out_size, out_name_size;
NDArrayHandle *out_arr;
SymbolHandle sym;
const char ** out_names;
const char * param_fn = "checkpoint-0000.params";
const char * symbol_fn = "symbol.json";

printf("Loading mxnet\n");
MXInitialise();

MXNDArrayLoad(param_fn, &out_size, &out_arr, &out_name_size, &out_names);
MXSymbolCreateFromFile(symbol_fn, &sym);
return 0;

}
`

you will get
Check failed: op != nullptr Operator FullyConnected is not registered

@marcoabreu
Copy link
Contributor

Thanks for the explanation!

@szha @piiswrong is this the correct usage?

* \param out_names the names of returning NDArrays, can be NULL
* \return 0 when success, -1 when failure happens
*/
MXNET_DLL int MXNDArrayLoadListFromMemory(const void *buf,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it it becomes difficult to put all your resources in 1 file. For instance we are a text to speech company and we need to keep all the resources for a voice in one file. One could also resolve the problem by using a file pointer instead of a buffer, but I thought this would be a more general approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it would allow you to use encrypted or compressed formats too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have an API for loading and saving list/dict of NDArrays

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the c_api.h the only function that looked right for working from memory was MXNDArrayLoadFromRawBytes but that seems to only work with 1 array rather than loading a list/dict unless I misunderstood?

Copy link
Contributor Author

@dabraude dabraude Feb 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking over the predict API and there are functions there that can get this data
but it requires loading the list handle, then manually creating NDArrays from each element in the list. Would it not be better to do this in one function in the main api?

Merge branch 'master' of git://github.com/apache/incubator-mxnet
@dabraude
Copy link
Contributor Author

sorry I was away for a long weekend, any other comments or questions?

* This ensures the engine has been initialised and
* the operators have been registered.
*/
MXNET_DLL void MXInitialise();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Initialize is spelled with a 'z' :)

ret->ret_handles.resize(data.size());
for (size_t i = 0; i < data.size(); ++i) {
NDArray *ptr = new NDArray();
*ptr = data[i];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you immediately reassign 'ptr' and leak the newly created NDArray?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valgrind gives no errors, and if you look at MXNDArrayLoad it is the same as how files are loaded

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, it's dereferenced. Not sure how I missed that.

@@ -91,6 +91,12 @@ int MXRandomSeed(int seed) {
API_END();
}

void MXInitialise() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if this isn't called?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put in the example code above, but the functions are not registered so if you try and load a symbol from a file before calling it you will get an error:

Check failed: op != nullptr Operator FullyConnected is not registered

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So wouldn't this rather be a task to do when you load a file, rather than having this as a standalone function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at src/c_api/c_predict_api.cc lines 111
There is a bit where all the symbols are registered with MXListAllOpNames
I think it would be better to use a function whose name is clear, which can be inserted whenever it is needed such as in file loading and the prediction API.

@piiswrong
Copy link
Contributor

piiswrong commented Feb 5, 2018

The CAPI is meant to follow the hour glass model. Its like small kernel. If something can be done by composing existing API we shouldn't add a new API for it. It should be done in the frontend.

This API feels like a convenience wrapper rather than core functionality.

@dabraude
Copy link
Contributor Author

dabraude commented Feb 6, 2018

so I tried to implement the equivalent to the load from memory using the predict API and the main API, but in the end I failed. I need to be able to load the main API because it is an LSTM model which the Predict API does not support. I will remove the initialise function and fix the bugs with symbol loading instead but I don't think it is actually possible to fix load a dictionary / list from memory into NDArray handles. Would you prefer me moving the loading function into a different request so the bug can be fixed quicker?

@dabraude dabraude changed the title Added two functions to the C API Fix bug with symbol loading, and added function for loading NDArrays from memory. Feb 6, 2018
@dabraude
Copy link
Contributor Author

Was at a conference, don't know where that build error was from but it seemed to fix itself once I updated to the latest version

@dabraude dabraude changed the title Fix bug with symbol loading, and added function for loading NDArrays from memory. Fix bug with symbol loading Feb 20, 2018
Copy link
Contributor Author

@dabraude dabraude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated and removed extra api feature

@cjolivier01
Copy link
Member

"Approved changes" by me only refer to my comments.

juliusshufan and others added 8 commits February 24, 2018 09:27
… CNN network using the models defined by example/image-classification (apache#9805)

* Enable the reporting of cross-entropy or nll loss value during training

* Set the default value of loss as a '' to avoid a Python runtime issue when loss argument is not set
* Add SoftSign Activation function
* draft

* move to contrib

* rename op

* CR comments

* Update contrib.py

* Update contrib.py

* Update random.py

* update example in the doc

* update example in symbol doc

* CR comments

* update op name

* update op name

* update op name in test

* update test

* Update contrib.py
* Example implementation of the softmax output layer using RTC

* Remove broken NDArrayOp example

* Update README.md of the python custom operator examples
… etc. (apache#8972)

* Profiling enhancements

* Set new dmlc-core commit

* Address CR comments

* Update profiler.py

* lint
* added test output files to gitignore

* corrected indentation
@dabraude
Copy link
Contributor Author

this has fallen way behind the master I'm going to close this pull request and do the two parts separately

@dabraude dabraude closed this Feb 24, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.