Improve GlobaLeaks DB by removing all "Data Blob", making them sql structures #573

Closed
fpietrosanti opened this issue Sep 1, 2013 · 11 comments


@fpietrosanti
Contributor

@evilaliv3 noticed that the GlobaLeaks DB contains certain "data blobs" (like the one containing "preview") holding data that has to be parsed before it can be used.

This is an inefficient database design that leads to a lot of software complexity and to bugs.

This ticket is to improve the GlobaLeaks DB by removing all "data blobs" and turning them into proper SQL structures.

@evilaliv3
Member

@fpietrosanti this is somewhat a duplicate of #295.

In #295 the idea was to convert the Pickle (a Python object) into a JSON blob. I propose closing the old ticket and marking this new one as the proper solution.

The only efficient, secure and clean solution is to use only SQL fields without aggregated values.

@vecna
Contributor

vecna commented Sep 2, 2013

How would you manage "SQL fields without aggregated values" when there is a dynamic amount of translated content?

The only reason for having dictionaries in the DB is to keep the localized data inserted by the admin.

@evilaliv3
Member

Aggregation is only one way of representing the data, and the problem described by this ticket is very common in DB design.

One generally runs into this problem when porting a legacy system to an SQL DB. The procedure for avoiding an aggregated field representation is simple and in line with common best practices: all it needs is a table dedicated to fields, which is reasonable given the importance of our "Field" concept.

Here is an example:
Field: {field_id, language, title, description, whatever}
where the key is the pair {field_id, language}
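
A minimal sketch of such a table, using Python's built-in `sqlite3` module. The column names follow the example above; the column types, the sample rows and the `get_field` helper are illustrative assumptions, not existing GlobaLeaks code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE field (
        field_id    TEXT NOT NULL,
        language    TEXT NOT NULL,
        title       TEXT NOT NULL,
        description TEXT NOT NULL,
        PRIMARY KEY (field_id, language)  -- the {field_id, language} pair
    )
    """
)

# Hypothetical sample data: one field translated into two languages.
conn.executemany(
    "INSERT INTO field VALUES (?, ?, ?, ?)",
    [
        ("f1", "en", "Short title", "Describe the issue briefly"),
        ("f1", "it", "Titolo breve", "Descrivi brevemente il problema"),
    ],
)


def get_field(conn, field_id, language):
    """Return (title, description) for one translation, or None if missing."""
    return conn.execute(
        "SELECT title, description FROM field"
        " WHERE field_id = ? AND language = ?",
        (field_id, language),
    ).fetchone()


print(get_field(conn, "f1", "it"))
```

With this layout, adding a language means inserting more rows, and the composite primary key rejects a second translation of the same field in the same language.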

@vecna
Contributor

vecna commented Sep 2, 2013

I understand, but I don't know what the goal is here:

Are we going to change the Pickle() objects? Yes or no, and why?
Are we going to use JSON data instead, as defined in #295? Yes or no, and why?
Are we going to use language as a key, splitting context, receiver and node into two separate tables?

@evilaliv3
Member

OK, here are the answers to your three questions:

  1. Are we going to change the Pickle() objects? Yes or no, and why?
    The second pentest advised that, since Pickle objects are Python objects, storing them is not really safe and leaves us open to some threats.

  2. Are we going to use JSON data instead, as defined in Use JSONVariable instead of Pickle (GL01-00??) #295? Yes or no, and why?
    I do not agree with the suggestion to use JSON. JSON is an aggregated format that is not natural to SQL, so it leads to conversion problems during DB migrations (do you remember, for example, how messy the code needed to migrate fields was after changing the translation representation?).

  3. Are we going to use language as a key, splitting context, receiver and node into two separate tables?
    If this sounds reasonable, it is probably what a DB specialist would suggest to us.
    Exactly as you are thinking, this is an example of the traditional use case:

a "Context" table would contain global columns with {id} as key
a "Context_Translation" table would contain translations with {context_id, language} as key

This is the general way to unroll aggregated data, passing from the messiest normal form (1NF) to the second normal form (2NF). [http://en.wikipedia.org/wiki/Database_normalization]
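
The unrolling described above can be sketched as follows, again with Python's `sqlite3` module. The legacy row layout, the `position` and `name` columns and the sample values are assumptions made for illustration; only the two table names and their keys come from the description above.

```python
import json
import sqlite3

# Hypothetical legacy rows: (id, position, translations stored as a JSON blob).
legacy_rows = [
    ("c1", 10, json.dumps({"en": "Corruption", "it": "Corruzione"})),
]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE context (
        id       TEXT PRIMARY KEY,
        position INTEGER NOT NULL          -- a global, language-independent column
    );
    CREATE TABLE context_translation (
        context_id TEXT NOT NULL REFERENCES context(id),
        language   TEXT NOT NULL,
        name       TEXT NOT NULL,
        PRIMARY KEY (context_id, language)
    );
    """
)

# Unroll each aggregated blob into one row per language.
for context_id, position, blob in legacy_rows:
    conn.execute("INSERT INTO context VALUES (?, ?)", (context_id, position))
    for language, name in json.loads(blob).items():
        conn.execute(
            "INSERT INTO context_translation VALUES (?, ?, ?)",
            (context_id, language, name),
        )

# Read the data back by joining the global and the translated parts.
rows = conn.execute(
    """
    SELECT c.id, t.language, t.name
    FROM context AS c
    JOIN context_translation AS t ON t.context_id = c.id
    ORDER BY t.language
    """
).fetchall()
print(rows)
```

Once the translations live in their own rows, later schema changes can be expressed as plain SQL statements instead of code that parses and rewrites each blob.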

@vecna
Contributor

vecna commented Sep 2, 2013

Point three brings a major refactor of all the unit tests, models, handlers and jobs. It can only be done in a three-day hackathon.

@evilaliv3
Member

Yep :) I strongly agree: all these things require good collaborative design.

@fpietrosanti
Contributor Author

Is this needed for the discussion related to the statistics?

@fpietrosanti
Contributor Author

@evilaliv3 What about this quite old ticket?

@evilaliv3
Member

This ticket is not solved (Pickle data structures still exist) and won't be solved in the short term, as JSON data structures will continue to exist for the moment, also in some components of the new fields.

Anyway, let's keep the ticket open in the wishlist.

@evilaliv3
Member

Having addressed #295, which represented a security issue, and having changed the data structure for fields by implementing a dedicated table, I consider this ticket closed.

Obviously, for various reasons we still need some JSON variables, and we should try to reduce their use over time.

@evilaliv3 evilaliv3 modified the milestones: 2014 December, Wishlist Dec 21, 2014