Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cascade data and study on deletion #62

Merged
merged 1 commit into from May 29, 2014
Merged

cascade data and study on deletion #62

merged 1 commit into from May 29, 2014

Conversation

squirrelo
Copy link
Contributor

Makes deletion of the objects easier.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 2243227 on squirrelo:cascade into f880690 on biocore:master.

josenavas added a commit that referenced this pull request May 29, 2014
cascade data and study on deletion
@josenavas josenavas merged commit f5df8bb into qiita-spots:master May 29, 2014
@josenavas josenavas deleted the cascade branch May 29, 2014 23:33
@adamrp
Copy link
Contributor

adamrp commented May 29, 2014

Doesn't this make it a little... TOO easy? I mean, now, if we delete a row from the raw sequence table, it will delete everything that references that raw sequence file, and so on down the line

@josenavas
Copy link
Contributor

Yes, that's the point from the DB view. We should make sure from Python that we are not deleting a public study and or raw data, but if we don't add the cascade, we will need to remove them manually, which is a pain in the ass. Makes sense?

@adamrp
Copy link
Contributor

adamrp commented May 29, 2014

Right, I get what the cascade does, but it just means that if (when) we
need interact with the database through psql, we need to be EXTREMELY
careful, since we could very easily blow away large chunks of the database
accidentally.

On Thu, May 29, 2014 at 5:35 PM, josenavas notifications@github.com wrote:

Yes, that's the point from the DB view. We should make sure from Python
that we are not deleting a public study and or raw data, but if we don't
add the cascade, we will need to remove them manually, which is a pain in
the ass. Makes sense?


Reply to this email directly or view it on GitHub
#62 (comment).

@josenavas
Copy link
Contributor

Yes, I agree, but my impression is that we should rarely access to the DB through psql. If we need to access the DB through psql, that means that we are missing some functionality on Qiita.

@adamrp
Copy link
Contributor

adamrp commented May 29, 2014

Although in theory you may be correct, we currently use SQLDeveloper a whole lot. @teravest and @ElDeveloper, I'd be especially interested to hear your opinions.

@squirrelo
Copy link
Contributor Author

I'd say having to use SQLDeveloper or psql a whole lot is a symptom of the current database's shortcomings. Since we are building from scratch we can avoid these issues (hopefully).

@ElDeveloper
Copy link
Member

I'm divided in this matter; even though I agree that we shouldn't be using the console to interact with a live system the reality is that when things become urgent/pressing you have to do things one way and it's usually by running ad-hoc queries, scripts, etc. This is one of those other things that has to go in the documentation i. e. if you delete a piece of information in X table, it will cascade to delete it from Y as well.


Also I though we were doing two devs reviewing per PR?

@squirrelo
Copy link
Contributor Author

Also, the cascades are made to make sense: if you delete the study, all data and samples associated with it are deleted automatically. If you delete raw data, it deletes all preprocessed and processed data that came from it. The cascades are there to make life easier and to make sure we don't have lingering stubs of data, e.g. processed data for a phantom study that was deleted a while ago.

@adamrp
Copy link
Contributor

adamrp commented May 30, 2014

It seems to me that if we are interacting with the database
programmatically, then it is simple to add extra delete statements into the
code. If we are interacting manually, then the cascades become dangerous. I
think they make sense in some cases, but should be used more sparingly.
IMHO, deleting a record in a basal table should not initiate a cascade of
deletes throughout basically every table in the database.
On May 29, 2014 11:13 PM, "Joshua Shorenstein" notifications@github.com
wrote:

Also, the cascades are made to make sense: if you delete the study, all
data and samples associated with it are deleted automatically. If you
delete raw data, it deletes all preprocessed and processed data that came
from it. The cascades are there to make life easier and to make sure we
don't have lingering stubs of data, e.g. processed data for a phantom study
that was deleted a while ago.


Reply to this email directly or view it on GitHub
#62 (comment).

@wasade
Copy link
Contributor

wasade commented May 30, 2014

+1
On May 30, 2014 7:39 AM, "adamrp" notifications@github.com wrote:

It seems to me that if we are interacting with the database
programmatically, then it is simple to add extra delete statements into
the
code. If we are interacting manually, then the cascades become dangerous.
I
think they make sense in some cases, but should be used more sparingly.
IMHO, deleting a record in a basal table should not initiate a cascade of
deletes throughout basically every table in the database.
On May 29, 2014 11:13 PM, "Joshua Shorenstein" notifications@github.com
wrote:

Also, the cascades are made to make sense: if you delete the study, all
data and samples associated with it are deleted automatically. If you
delete raw data, it deletes all preprocessed and processed data that
came
from it. The cascades are there to make life easier and to make sure we
don't have lingering stubs of data, e.g. processed data for a phantom
study
that was deleted a while ago.


Reply to this email directly or view it on GitHub
#62 (comment).


Reply to this email directly or view it on GitHub
#62 (comment).

@teravest
Copy link
Contributor

I use sqldeveloper almost daily. I would not consider that a short coming of the current database. I would consider that part of dealing with any database. There will be a time when all of the current developers are no longer around and new developers are trying to figure out what we did. And they are most likely going to use something like PGAdmin to figure out the database.

It does seem like having all the cascades could be extremely dangerous and detrimental. If you try and delete a row that has a foreign key in another table you will get a delete error, so if the constraints are set correctly, we shouldn't have to deal with inconsistent data after a delete. And it would be very unfortunate to accidentally delete something you didn't want to delete because of an aggressive cascade policy.

@josenavas
Copy link
Contributor

Thanks for all the comments, I can see that is more dangerous than beneficial. I'm working on a PR to revert this merge, sorry about that! It made sense, but without cascading is more secure.

josenavas added a commit to josenavas/QiiTa that referenced this pull request May 30, 2014
This reverts commit f5df8bb, reversing
changes made to f880690.
squirrelo added a commit that referenced this pull request May 30, 2014
Revert "Merge pull request #62 from squirrelo/cascade"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants