-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Database region #860
Database region #860
Conversation
pandaDoc readme update htm-community/HTMpandaVis@37b95a4 |
Wow! Great job 👏
nit, but imho the "database" is also not ideal, maybe
looks enough in general, but as is with limits..is there a way how users can define more inputs? And a howto? |
I am not sure... @dkeeney what do you think about this? Or
There is general short description in the header number of 10 is a constant defined in header. Do you think that is a problem? I welcome any suggestions. I was careful to not change much, since i don't want to break structure of regions by making this region too different from others. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good! 💯
I like the tests, examples and all.
- let's consider what's a self-explanatory region's name
- please change some of the small coding style things below
- IF there's a reasonable way to make the num inputs configurable?
- please add some readme how the region is used with the visualization
Overall, it fits great and is approved from me. I'd like @dkeeney have it reviewed too as he's the expert on NAPI.
* the SQLite3 database file. | ||
* | ||
* Inputs can be now scalars (floats). they have names like: | ||
* (dataIn0,dataIn1 ... up to MAX_NUMBER_OF_INPUTS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the MAX_NUMBER_OF_INPUTS should probably be defined in the header (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i made that const as you suggest, and also static, but if i use it only in cpp i think it shouldn't be in the header, since the header is aka interface. But maybe i can make it private class member? But that will not be so visible if its constant that can be changed
#include <htm/regions/DatabaseOutRegion.hpp> | ||
#include <htm/utils/Log.hpp> | ||
|
||
#define MAX_NUMBER_OF_INPUTS 10 // maximal number of inputs/scalar streams in the database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we try to avoid defines, if possible. could you use const UInt MAX_NUM... = 10;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...as my prev. comment about "how users can modify this", it is already fine like this
- if users really need more than 10 inputs, they can change it here and recompile the code.
- would it be possible to make the MAX_NUM_OF_INPUTS part of the region's Spec, ie making it easier to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this parameter IS in the Spec. It is the number spec fields to create. The Spec is static, used before the Region is created and therefore this cannot be a parameter passed in at creation.
I personally don't have a problem with #define. But @breznak wants to pull us out of our old C ways and do things more C++ like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I wonder if you need it at all.
These are used to declare the inputs so you can use links with them. I have not thought this through but what if you used one input and used the Fan-In capability of the Link. The links for data for all fields would be connected to the same input. The Fan-In facility will append the buffers together and are passed into the region as one big array.
Don't know....probably would be hard to separate them out into fields inside the region. Probably not a good idea.
The point is there might be another way to solve this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this parameter IS in the Spec. It is the number spec fields to create. The Spec is static, used before the Region is created
so the other way, can we "read" it from the Spec and get it to the field here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I wonder if you need it at all.
These are used to declare the inputs so you can use links with them.
you need it for the fields in the DB (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need it for the fields in the DB (?)
We need to declare inputs so we can get data for each field of the record being written to the database. Ideally the names of those inputs should be the names of the db fields being stored. But since those names can be anything and could be different for each instance of the region, they could not be in the Spec. The Spec must be static and the same for all instances of the region, set at compile time. So that is why he has used generic names in the Spec.
I was trying to think of a way where the names of the inputs which correspond to DB fields, and their type, could be set during region creation if not even at runtime like it is in SQL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to think of a way where the names of the inputs which correspond to DB fields, and their type, could be set during region creation if not even at runtime like it is in SQL.
Hmm yes i am thinking the same way, but unless we define new link datatype like tuple, we can't dynamically add the inputs.. I want to keep this simple. I have no idea now how to implement this better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really cool.
I did not find anything that would cause it not to work. I have a few suggestions ... mostly ideas for future PR's
#include <htm/regions/DatabaseOutRegion.hpp> | ||
#include <htm/utils/Log.hpp> | ||
|
||
#define MAX_NUMBER_OF_INPUTS 10 // maximal number of inputs/scalar streams in the database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this parameter IS in the Spec. It is the number spec fields to create. The Spec is static, used before the Region is created and therefore this cannot be a parameter passed in at creation.
I personally don't have a problem with #define. But @breznak wants to pull us out of our old C ways and do things more C++ like.
insertData("dataStream_" + inp.first, inObj); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still thinking about a DatabaseRegion ... I can see how a similar loop could be added here to iterate the outputs and populate them with data from the database. May be a problem doing both input and output at the same time with the same instance of the region ... but it is something to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes i think it can be used for reading from the database with not much effort, but i see no use-case now. That's also why it would be good to name the region "DatabaseRegion" as you suggest.
#include <htm/regions/DatabaseOutRegion.hpp> | ||
#include <htm/utils/Log.hpp> | ||
|
||
#define MAX_NUMBER_OF_INPUTS 10 // maximal number of inputs/scalar streams in the database |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I wonder if you need it at all.
These are used to declare the inputs so you can use links with them. I have not thought this through but what if you used one input and used the Fan-In capability of the Link. The links for data for all fields would be connected to the same input. The Fan-In facility will append the buffers together and are passed into the region as one big array.
Don't know....probably would be hard to separate them out into fields inside the region. Probably not a good idea.
The point is there might be another way to solve this problem.
Actually I like "DatabaseRegion" with the idea that it might eventually handle both input and output. Along the same line, I have thought many times that the "FileOutputRegion" and "FileInputRegion" should be combined into a FileRegion ... one of those back burner projects. However, I am not going to insist on "DatabaseRegion". |
Yes i am also for the renaming from "DatabaseOutRegion" to "DatabaseRegion" to have it more general, do you agree @breznak ? |
PR htm-community#860 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am impressed. This is very nicely done.
|
||
// Setup data flows between regions | ||
net.link("encoder", "sp_global", "", "", "encoded", "bottomUpIn"); | ||
net.link("sp_global", "tm", "", "", "bottomUpOut", "bottomUpIn"); | ||
net.link("tm", "output", "UniformLink", "", "bottomUpOut", "dataIn"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "UniformLink", or anything in this argument, is ignored. This is a hold-over from the NuPIC code.
I'm joining the celebration 💯 This is great and professional implementation. Thank you @Zbysekz 👏 and both of you for tuning the details. |
Thank you guys for thoughtful review and cooperation 👍 😃 |
Closes #748
Closes htm-community/HTMpandaVis#4
I have named the region "DatabaseOutRegion" instead of PlotRegion -> i think it is more general, since the region really writes data in prescribed format in the database and do nothing with plot in fact.
Unit test was created, it is copy of napi_hello but strip down to bare bones. It writes 20 epochs to the db file, then reads how many rows are in whole database (should be 40 because we are storing raw anomaly & encoder bucket value). Hope this is enough?
Napi_hello is extended by FileOutputRegion, (so the results are stored in csv) as discussed in A new Region for plots in C++ (for C++ and C# apps) #748 (comment)
The DatabaseOutRegion have 10 inputs, called
dataIn0
-dataIn9
user can use how many is needed. For each input table in SQL is created. The table name isdataStream_dataIn0
-dataStream_dataIn9
. Table contains two columns, one is autoincrement iteration, second is the REAL value.