provide push API for statical information #35
Comments
create #37 first |
Hi, what needs to be done to achieve this? A new Push API has been created (#55), but the data is saved as a MessageEntry without a harvesting frequency or a lifetime as described above. |
…ources Issues: #205 #206 Server side requirements: loklak/loklak_server#35 loklak/loklak_server#37 loklak/loklak_server#58 (implemented) loklak/loklak_server#59 (implemented)
implemented with api/push/geojson.json |
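For illustration, a minimal sketch of how such a push could be triggered from Java. Only the endpoint path comes from the comment above; the host, port and parameter names (url, source_type) are assumptions, not confirmed by this thread:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class GeoJsonPushExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical parameter names; the thread only confirms the endpoint path.
        String source = URLEncoder.encode("http://example.org/data.geojson",
                StandardCharsets.UTF_8.name());
        String endpoint = "http://localhost:9000/api/push/geojson.json"
                + "?url=" + source
                + "&source_type=GEOJSON";

        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("GET");

        // Print the JSON response returned by the push servlet.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) System.out.println(line);
        }
    }
}
```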
I have several blocker questions that are critical to implementing the connect service interface:
|
Nowhere yet. There must be a new data structure to hold this. At this time, data can just be read from that url, and if the import shall start again, the api must be called again. That is of course not the target design. It is true that the url must be stored, and then the harvesting frequency must either be submitted as well or be computed by trial.
We already have a mechanism in loklak which does a very similar thing: the query index. This index stores all words which have been submitted as queries, stores the message frequency and provides a prediction of when the next message for the query may appear. We need something like this for IoT imports as well. Designing such a thing is somewhat critical because it's difficult to clean up a messed-up data structure later. Therefore I would like to collect some more experience with the API before starting automated imports. From my point of view they can be added later, and meanwhile we can help ourselves with cron jobs pushing to the API again and again.
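As a rough illustration of computing the frequency "by trial" (this is not the actual query index implementation, and all names are hypothetical): a schedule could shorten its interval whenever an import yields new messages and lengthen it otherwise, giving a prediction of the next harvest time.

```java
import java.util.Date;

/**
 * Illustrative sketch of an adaptive re-harvesting schedule:
 * the interval shrinks when an import yields new messages and
 * grows when it does not, bounded by a minimum and a maximum.
 */
public class HarvestSchedule {

    private static final long MIN_INTERVAL_MS = 60_000L;             // 1 minute
    private static final long MAX_INTERVAL_MS = 7 * 24 * 3_600_000L; // 1 week

    private long intervalMs = 3_600_000L; // start with 1 hour
    private long lastHarvest = System.currentTimeMillis();

    /** Record the outcome of one import and adjust the interval. */
    public void recordHarvest(int newMessages) {
        if (newMessages > 0) {
            intervalMs = Math.max(MIN_INTERVAL_MS, intervalMs / 2); // harvest more often
        } else {
            intervalMs = Math.min(MAX_INTERVAL_MS, intervalMs * 2); // back off
        }
        lastHarvest = System.currentTimeMillis();
    }

    /** Predicted time of the next harvest, analogous to the query index prediction. */
    public Date nextHarvest() {
        return new Date(lastHarvest + intervalMs);
    }
}
```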
One source should of course contain several messages! You may think of duplicate messages (that's your next question) but that should never be the case for several messages from one import. However, we must take care of it, see below.
I don't exactly understand how the several topics you address (re-harvesting, data types and message duplication) are related; I believe they are unrelated. However, they should each be considered, just not connected:
We could compute a hash from a string consisting of the harvesting url and the location. That hash must be stored in a kind of device hash field in the message, or we re-use another field for that, i.e. the link field in the form |
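A minimal sketch of such a hash, assuming the location is given as latitude/longitude; which field it would be stored in (a device hash field or the link field) is left open, as in the comment above:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SourceHash {

    /**
     * Compute a stable hash from the harvesting url and the location,
     * suitable for storing in a device-hash-like field of the message.
     */
    public static String sourceHash(String harvestUrl, double lat, double lon)
            throws NoSuchAlgorithmException {
        String key = harvestUrl + "|" + lat + "|" + lon;
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(key.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b & 0xff));
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(sourceHash("http://example.org/data.geojson", 52.52, 13.40));
    }
}
```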
This is just a question about the data schema. The question, in a cleaner form, is: is it wiser to save the import source information and the list of imported messages in a new data structure (e.g. SourceEntry), rather than to save the import source information inside each imported message? I think you already answered this, and I totally agree - saving it in a new data structure is the way to go:
And thank you for the high level of detail. This answer definitely helps a lot. |
I think there is a mix-up of two things here:
The new data structure should hold the source url and the import metadata, not the source content. The imported content must be adjusted to fit into our message format. How to do that is already answered: you implemented a mapping for this, and I suggested adding the source content (i.e. the content of the property object from geojson) as part of the message in the same fashion as rich texts are stored. |
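To make the separation concrete, here is an illustrative sketch of such a data structure, using the SourceEntry name suggested earlier in the thread; the class and field names are hypothetical and not taken from the loklak code base:

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/**
 * Illustrative sketch: import source information and metadata live in their
 * own structure, while the imported messages only carry a reference to it.
 */
public class SourceEntry {

    private final String sourceUrl;      // where the data is harvested from
    private final String sourceType;     // e.g. "GEOJSON" or "RSS"
    private long harvestIntervalMillis;  // submitted with the push or computed by trial
    private Date lastHarvest;            // time of the last import
    private final List<String> messageIds = new ArrayList<>(); // imported messages

    public SourceEntry(String sourceUrl, String sourceType, long harvestIntervalMillis) {
        this.sourceUrl = sourceUrl;
        this.sourceType = sourceType;
        this.harvestIntervalMillis = harvestIntervalMillis;
        this.lastHarvest = new Date();
    }

    public void addMessageId(String id) { this.messageIds.add(id); }

    public String getSourceUrl() { return this.sourceUrl; }
    public String getSourceType() { return this.sourceType; }
    public long getHarvestIntervalMillis() { return this.harvestIntervalMillis; }
    public Date getLastHarvest() { return this.lastHarvest; }
    public List<String> getMessageIds() { return this.messageIds; }
}
```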
This commit introduces two new features:
- save the import profile when pushing custom messages. Currently it is only implemented in /api/push/geojson.json
- /api/import.json with a source_type parameter to retrieve the import profiles list by source_type
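A minimal usage sketch of the second endpoint, assuming a local server; only the endpoint path and the source_type parameter are taken from the commit description above, everything else is an assumption:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ImportProfileQuery {
    public static void main(String[] args) throws Exception {
        // source_type parameter as named in the commit message; host and port are assumptions.
        URL url = new URL("http://localhost:9000/api/import.json?source_type=GEOJSON");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        // Print the returned JSON, expected to be the list of matching import profiles.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) System.out.println(line);
        }
    }
}
```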
Implemented in #83 |
To add more sources than those harvested from Twitter, we want to add data from other sources, including RSS feeds and geoJSON data. These sources must be added to the message index in the context of a lifetime flag (#33).
The data submitted to the API must therefore include: