Massive commit of decoupling from mongo and adding a PanFileDB type #414
wtgee merged 14 commits into panoptes:develop
Conversation
* PanDB is a factory class that will load either PanMongoDB or PanFileDB
* PanFileDB merely writes the json to a flat file
* One file per "collection" along with a `current_<collection>` for current entries. Note this means each "current" collection is a separate file.
* Use `bson.json_util` for serializing. Changes a few things with time because it _can_ store timezone aware info, which we don't use properly in `current_time`.
* Try to only rely on one DB for weakref keep - WIP
* Clear both mongo and file types of their "current" entries upon POCS init
* Change `include_collection` to `store_permanently` for calls to `insert_current`
* Change `use_mongo` to `store_result` where appropriate
* Private POCS method `_check_environment` changed to public class method, i.e. `POCS.check_environment`
* Tests that use the `db` now parameterize it across db types. Now testing takes almost twice as long! (should deal with this somehow)
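The factory-plus-flat-file design described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual POCS implementation: the class internals, file naming, and use of uuid4 ids for the file backend are assumptions based on the description in this PR.

```python
# Hypothetical sketch of a PanDB factory with a flat-file backend.
# Names and behaviour are assumptions drawn from the PR description.
import json
import os
import tempfile
import uuid


class PanFileDB:
    """Stores each 'collection' as a flat file of JSON lines."""

    def __init__(self, db_dir):
        self.db_dir = db_dir
        os.makedirs(db_dir, exist_ok=True)

    def insert_current(self, collection, obj, store_permanently=True):
        # The flat-file backend uses a uuid4 rather than a mongo ObjectId.
        rec_id = str(uuid.uuid4())
        record = {'_id': rec_id, 'data': obj}

        # Each "current" collection is a separate file, overwritten per insert.
        current = os.path.join(self.db_dir, f'current_{collection}.json')
        with open(current, 'w') as f:
            json.dump(record, f)

        # store_permanently also appends to the permanent collection file.
        if store_permanently:
            permanent = os.path.join(self.db_dir, f'{collection}.json')
            with open(permanent, 'a') as f:
                f.write(json.dumps(record) + '\n')

        return rec_id


class PanDB:
    """Factory that returns the configured backend instance."""

    def __new__(cls, db_type='file', **kwargs):
        if db_type == 'file':
            return PanFileDB(**kwargs)
        raise ValueError(f'Unknown db type: {db_type}')
```

A caller would then do `PanDB(db_type='file', db_dir=...)` and get back whichever backend the config selects, without the call sites knowing which one they hold.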
I think we can use Python's json module, as it supports custom encoder and decoder classes. I'm willing to help with that if you like.
I can see some advantages to a custom encoder (as mentioned in #206) but I don't see any immediate need when this already does what we want.
Don't you have to install mongodb to use bson.json_util?
No, but I could still see some use cases for a custom encoder/decoder, just not sure it's a priority.
FWIW, I notice that we don't use bson.json_util for much: Most of the |
…ple-storage-from-mongo-120
I get the feeling someone doesn't want us to use `bson.json_util`. We should just have a common encode/decode method in our utils folder that would let us change out the underlying implementation however we want. As for the files listed above, it's not a great example of the usage, and the main ones in messaging and database are the ones that have given us problems. I'm not necessarily tied to `bson.json_util`. For now I'll make a common function that can be called so it's easier to change in future.
I just don't want to be tied to Mongo. I think I now better understand that loads(default=bson.json_utils.default) provides the hook for serializing. We can leave that as is. BUT, let's put the json code into a single py file, so that we can swap it out easily. |
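The "single py file" idea above could look something like the sketch below, using only the stdlib json module's `default` hook. The function names (`to_json`/`from_json`) and the datetime handling are assumptions; swapping in `bson.json_util.default` later would only touch this one module.

```python
# Sketch of a single-module JSON helper, so the serializer implementation
# can be swapped (e.g. for bson.json_util) without touching call sites.
# Names and the datetime representation here are assumptions.
import json
from datetime import datetime


def _default(obj):
    # Minimal stand-in for bson.json_util.default: handle types that
    # json.dumps cannot serialize on its own.
    if isinstance(obj, datetime):
        return {'$date': obj.isoformat()}
    raise TypeError(f'Cannot serialize {obj!r}')


def to_json(obj):
    """Serialize obj to a JSON string, routing unknown types via _default."""
    return json.dumps(obj, default=_default)


def from_json(text):
    """Deserialize a JSON string produced by to_json."""
    return json.loads(text)
```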
Fixing some testing scope
Removing timezone info from utils function
…ple-storage-from-mongo-120
Codecov Report
@@ Coverage Diff @@
## develop #414 +/- ##
===========================================
+ Coverage 67.22% 67.68% +0.45%
===========================================
Files 61 62 +1
Lines 5058 5164 +106
Branches 702 719 +17
===========================================
+ Hits 3400 3495 +95
- Misses 1471 1476 +5
- Partials 187 193 +6
Continue to review full report at Codecov.
jamessynge left a comment
Rename the PR (remove Major WIP) when you're ready for a final review.
mounts: POCS/resources/mounts
db:
    name: panoptes
    type: mongo
When should we switch to the new type?
I actually want to add an sqlite type and make that the default. I think the flat file is a good alternative but I'm not sure it's ideal.
# Set up a Timer that will wait for the duration of the exposure then copy a dummy FITS file
# to the specified path and adjust the headers according to the exposure time, type.
# Set up a Timer that will wait for the duration of the exposure then
We keep having little formatting/cleanup changes like these inside of bigger changes. Perhaps after this you'd be willing to do another PR that scrubs the whole repo? No need to remove this change from this PR.
Good idea. I've just been doing them as I see them, which does pollute the PR some.
@@ -290,15 +290,12 @@ def analyze_recent(self):
    self.current_offset_info))
# Update the observation info with the offsets
is_safe = record['data'].get('safe', False)
timestamp = record['date']
timestamp = record['date'].replace(tzinfo=None)
What is that replace tzinfo about?
Ah, I thought I had made a comment about this. One thing about bson is that it correctly serializes the datetime, and that includes timezone information. However, we are not properly using that timezone information in current_time. Basic math between a naive datetime and a timezone-aware datetime cannot be done, so here we are just dropping the tzinfo, which reverts to the same behaviour as we currently have.
I had a change to current_time(datetime=True) that made it timezone aware based off what timezone is listed in the config file, but that had implications beyond the scope of this PR. Will file an issue about it.
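The naive/aware mismatch described above is easy to demonstrate: mixed subtraction raises TypeError, and stripping the tzinfo (as `record['date'].replace(tzinfo=None)` does in the diff) restores the old behaviour.

```python
# Demonstration of why the tzinfo is dropped: Python refuses arithmetic
# between a naive datetime and a timezone-aware one.
from datetime import datetime, timezone

aware = datetime(2018, 1, 1, tzinfo=timezone.utc)   # what bson round-trips
naive = datetime(2018, 1, 2)                        # what current_time() returns

try:
    delta = naive - aware
except TypeError:
    delta = None  # mixed naive/aware math is not allowed

# Same trick as the diff: drop the tzinfo to get a comparable naive datetime.
stripped = aware.replace(tzinfo=None)
assert (naive - stripped).days == 1
```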
def test_mongo_objectid(forwarder, sub, pub, config, db):
    db.insert_current('config', {'foo': 'bar'})
    id0 = db.insert_current('config', {'foo': 'bar'}, store_permanently=False)
Is this a test of serialization/deserialization of a mongo specific type?
It was for a mongo specific ObjectId. With flat files that is a uuid4 id. Will rename test, but fundamentally it is testing whether that is delivered via the messaging system correctly.
def test_insert_and_no_permanent(db):
    rec = {'test': 'insert'}
    db.insert_current('config', rec, include_collection=False)
    id0 = db.insert_current('config', rec, store_permanently=False)
I should have asked earlier: what does store_permanently mean?
It replaces include_collection and basically means that it stores it in the permanent collection and not just in the current collection.
config=config,
simulator=['all'],
ignore_local_config=True,
db_name='panoptes_testing',
I think config provides this db name.
pocs = POCS(observatory,
            run_once=True,
            config=config,
            db_name='panoptes_testing',
pocs.power_down()

def test_run_interrupt_with_reschedule_of_target(observatory):
Probably very worthwhile, but is this related?
Hm, not sure why this is in this PR. 😕
I skimmed the latest 5 commits, which look reasonable and pass travis. Is this still WIP?
I'm not too sure about the test_pocs.py as it now runs almost double the tests, so I was going to look into changing that. I removed a few of the
But yes, I think it's ready for checking and I'd like to get on PAN008 soon since I can't have mongo.
No mongo because it is a Raspberry Pi?
Correct. You can get it on there, it is just a pain to get a recent version. FWIW, I checked out this branch on PAN008 last night and it seemed to work well. It was cloudy and I was mostly doing alignment so I didn't run through a full observation run, but the basics worked ok. This PR doesn't handle any kind of cleanup/maintenance of the files it creates. I'm also not sure the