Strategy for dealing with schema changes #146

Closed
tvollstaedt opened this issue Oct 9, 2013 · 14 comments

Comments

@tvollstaedt
Contributor

Is there any Yii/automatic/standard way to ensure that all my MongoDB instances (e.g. one for each developer) have a schema that doesn't break the application?

For example, I made a change which introduced a new field for a model. Records with no value for this field cause Yii/PHP to throw a notice, which is not desired. So I have two ways to handle this:

  1. Have a script/migration file which sets the new field on all my records to a (default) value that doesn't break the application (i.e. the application can rely on that value)
  2. Implement application logic which ensures that the app will not break on missing values

I initially preferred the second one, since it's the "NoSQL" way. But as my application grows it will eventually be impossible to track all the changes that have ever been made. And if I really do that, where do I draw the line between "expected" db values and "may be empty/changed"?

Wouldn't it be nice to implement some kind of automated migration logic in MongoYii which applies schema/model changes automatically when they occur, similar to Yii's built-in migrations?

@Sammaye
Owner

Sammaye commented Oct 9, 2013

Can you not just add a new variable declaration to your php model?

public $new_field;

That way PHP won't cry because the field is not there, and it acts like a lazy migration script, only you don't have to run that command on every row.

@tvollstaedt
Contributor Author

In this case, yes. In my case I added a new relation "user" to a model and used $model->user->name in a view. This threw a PHP notice, since $model->user returns null if the record doesn't have a user id set yet. That's why I thought about a generic way to handle such occurrences.

@Sammaye
Owner

Sammaye commented Oct 9, 2013

The only way is to make the null value invisible, but that would make it harder to judge when the relation does not exist; at the moment it is simply a falsy value. You should be able to safely detect whether the relation exists before using it:

if($user=$model->user)
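
A minimal sketch of that check wrapped in a helper, so a view does not repeat it for every access. The `User` and `Post` classes and the `userName()` function are hypothetical stand-ins for illustration, not MongoYii classes:

```php
<?php
// Hypothetical stand-ins for real models; only the null check matters here.
class User {
    public $name = 'alice';
}

class Post {
    public $user; // stays null while the record has no user id set
}

// Returns the related user's name, or a fallback when the relation is missing.
function userName(Post $model, $fallback = '(no user)') {
    $user = $model->user;
    return $user ? $user->name : $fallback;
}

$old = new Post();         // legacy record, relation not set
$new = new Post();
$new->user = new User();

echo userName($old), "\n"; // prints "(no user)"
echo userName($new), "\n"; // prints "alice"
```

This keeps the falsy null visible to callers that need it, while views get a safe default.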

@tvollstaedt
Contributor Author

Right, that would be the second way I thought of: make the application verify values itself.

But imagine you are developing your application over 10 years and you have added about 50 new fields/relations/properties to your model. Your view would be like this:

if($model->user && $model->foo && $model->bar && $model->baz && ...)

I mean, isn't there any strategy for handling that? Like saying "I'm releasing a new version 2.0 where the db schema has to match what my model defines", so anyone who uses it needs to make sure they have proper db values set. Or migration files. Or something like that :)

How do you handle this (in the long run)?

@Sammaye
Owner

Sammaye commented Oct 10, 2013

How many relations are you thinking of having?

I normally, even on extremely complex websites, have a manageable number of relations etc., and if I don't, I refactor to make sure I do.

@tvollstaedt
Contributor Author

It's not only the relations, it's the whole issue. Even when empty fields don't throw any application errors, an empty field is still not a desired state when you are expecting a complete dataset.

I think the best way to handle this is to do both: write the application in a way that it won't break if fields are missing, and, when it goes live, use some kind of automated migration technique, which I still need to look into.

Please leave this issue open for a few more days, I'm really curious about other opinions.

@Sammaye
Owner

Sammaye commented Oct 10, 2013

How can you get a complete dataset if these fields have no data for those records?

This entire issue does confuse me a little. I understand the relations part because I have that problem myself, but for generic fields I cannot see the problem. But OK, I'll leave this issue open for a bit.

@tvollstaedt
Contributor Author

With an RDBMS (Yii in its raw form), you can provide migration files when your database schema changes. Within those, you can iterate over old datasets, insert new fields, update old records, create new tables, and so on.

In MongoYii, we don't need to change any schema, because the database adapts to the application, which is awesome.
But now, as my application grows bigger, I still have the same issues as with a classic RDBMS. At some point I need to update old records, because I add features or change my document structure. I not only need my application to ensure that those changes don't break it (e.g. the PHP notice), but sometimes I also have to update old records because they need properties which cannot be empty. In Yii I can, for example, automatically run these migration files when my application gets deployed (or git pulled, or whatever). In MongoDB, I'm still looking for a good method to accomplish this task. I don't want to run console commands by hand when needed; I just want to pull my code and have a database which doesn't break my application. Maybe this isn't even a technology, but a strict rule to follow, like: "I added a new field, so I need to: 1. run tests, 2. check for uses...". I'm looking for a best practice.

Maybe I'm overlooking something big here, but isn't this a real-life scenario for applications with document-oriented databases? Has no one ever developed a bigger application with MongoDB and had to deal with this kind of problem?
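
One way to get the "pull the code and the database is fine" behaviour is to record which migrations have already run and apply only the pending ones on each deploy. A minimal sketch, assuming a hypothetical `runPending()` helper; in a real setup the `$applied` list would be persisted (for example in its own collection) rather than held in memory:

```php
<?php
// Runs each named migration at most once and returns the updated list of
// applied names. $migrations maps a name to a callable (hypothetical layout).
function runPending(array $migrations, array $applied) {
    foreach ($migrations as $name => $migration) {
        if (in_array($name, $applied, true)) {
            continue; // already applied on an earlier deploy
        }
        call_user_func($migration);
        $applied[] = $name;
    }
    return $applied;
}

$ran = [];
$migrations = [
    'm001_add_new_field' => function () use (&$ran) { $ran[] = 'm001'; },
    'm002_split_phone'   => function () use (&$ran) { $ran[] = 'm002'; },
];

// First deploy runs both; the second deploy finds nothing pending.
$applied = runPending($migrations, []);
$applied = runPending($migrations, $applied);

echo implode(',', $ran), "\n"; // prints "m001,m002"
```

Calling this from a deploy hook makes the step automatic, at the cost of having to keep every migration safe to run unattended.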

@Sammaye
Owner

Sammaye commented Oct 10, 2013

It is a real-life scenario with SQL because, as you said, the schema is static and applies to every row.

In fact, I am going to digress here and state my best practices; however, I should say that these are my personal opinions.

I always define the table (schema) fully within the PHP classes. Take a sub-project I have: 123 fields defined in the User class. It isn't unusual.

At the end of the day, it is easier to apply defaults, and to read and make sense of the schema, if you write it into the class instead of using the schemaless abilities of MongoYii. In truth, I only added those abilities for small anon models; they were never meant to be used extensively.

For standards and sanity reasons, you should define your schema in your models just as you would if you were working directly against a schema-based database such as MySQL, PostgreSQL or MSSQL.

This immediately gets rid of the E_NOTICE on class variables and confines it to accessing properties on non-existent relations. This is a more in-depth version of what I said above, and it is why I am confused that you need a migration script here: if the field is null or non-existent, it should just come back as null.

@Sammaye
Owner

Sammaye commented Oct 10, 2013

Also, defining it in the class allows for defaults like 0 or sammaye etc.
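
A small sketch of what class-level defaults buy you for legacy records. `UserModel` and its `populate()` method are hypothetical plain-PHP stand-ins (a real model would extend the library's document class and be populated by it); the point is only that fields absent from the stored document keep their declared defaults:

```php
<?php
// Plain PHP stand-in for a model with the schema written into the class.
class UserModel {
    public $username = 'sammaye'; // string default
    public $login_count = 0;      // numeric default
    public $new_field;            // declared without a value, so it is null

    // Copies stored fields onto the object; absent fields keep their defaults.
    public function populate(array $doc) {
        foreach ($doc as $key => $value) {
            if (property_exists($this, $key)) {
                $this->$key = $value;
            }
        }
        return $this;
    }
}

// A legacy document that predates login_count and new_field.
$legacy = (new UserModel())->populate(['username' => 'tvollstaedt']);

var_dump($legacy->login_count); // int(0)
var_dump($legacy->new_field);   // NULL
```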

@tvollstaedt
Contributor Author

It's true, that really helps and I already do this, also for the simple reason that other developers should know which attributes they can use in a collection.

Now what would you do if, for example, you had a field $phone which holds a single string:

{
    ...
    phone: '1234567890',
    ...
}

Now one day you see that users (you have ~5000 records) need to save more than one number. They should have 3 different kinds of numbers:

{
     ...
     phone: {
          private: '1234567890', 
          work: '0987654321',
          cell: '612346782'
     },
     ...
}

How do you deal with that in your application? To my mind there should be a migration which ports the single string into, let's say, the 'private' subattribute.
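
The per-document step of such a migration could be sketched like this. `migratePhone()` is a hypothetical helper that works on plain arrays; a real migration would read each record from the collection, pass it through the function and save the result, which is left out here because it depends on the driver in use:

```php
<?php
// Converts the old string form of `phone` into the new sub-document form.
// Documents that are already migrated, or have no phone, are returned as-is,
// so running the migration twice is harmless.
function migratePhone(array $doc) {
    if (isset($doc['phone']) && is_string($doc['phone'])) {
        $doc['phone'] = ['private' => $doc['phone']];
    }
    return $doc;
}

$records = [
    ['name' => 'a', 'phone' => '1234567890'],
    ['name' => 'b', 'phone' => ['private' => '1234567890', 'work' => '0987654321']],
    ['name' => 'c'],
];

$migrated = array_map('migratePhone', $records);

echo json_encode($migrated[0]['phone']), "\n"; // prints {"private":"1234567890"}
```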

@Sammaye
Owner

Sammaye commented Oct 11, 2013

Personally I would make a getter function:

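public function getPhone($type){
    // Legacy records store a single string; return it for any type.
    if(is_string($this->phone))
        return $this->phone;
    // Newer records store an array keyed by type.
    if(isset($this->phone[$type]))
        return $this->phone[$type];
    return null;
}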

Something like that

@Sammaye
Owner

Sammaye commented Oct 11, 2013

Or I would make the getter return an array and set the class property to it, so that I can save it in that form.
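
A rough sketch of that second variant. The `Contact` class and the `getPhones()` name are hypothetical, and mapping the old string to the `private` key is an assumption taken from the suggestion earlier in the thread:

```php
<?php
class Contact {
    public $phone; // legacy records: string; current records: array keyed by type

    // Always returns the array form and writes it back to the property,
    // so the next save persists the new structure.
    public function getPhones() {
        if (is_string($this->phone)) {
            $this->phone = ['private' => $this->phone];
        } elseif (!is_array($this->phone)) {
            $this->phone = [];
        }
        return $this->phone;
    }

    // Returns one number by type, or null when that type is not set.
    public function getPhone($type) {
        $phones = $this->getPhones();
        return isset($phones[$type]) ? $phones[$type] : null;
    }
}

$legacy = new Contact();
$legacy->phone = '1234567890';

echo $legacy->getPhone('private'), "\n"; // prints "1234567890"
var_dump(is_array($legacy->phone));      // bool(true)
```

Unlike the first getter, this one returns null for `work` on a legacy record instead of handing back the old string for every type.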

@tvollstaedt
Contributor Author

OK, I see, and that's what I also thought of. The application can handle both field types and updates to the new document structure on the fly. I like that, but the downside would be an eventually bloated application (just for handling document changes). I believe it would be great to have some slim component/behavior where I can easily (in terms of code) add such structure changes (not sure what this would look like) and which updates the record automatically as soon as it gets read/used.

I think my summary is this:

  • I need to be sure that my application handles new fields/relations without breaking. I need to check for empty attributes, use default values and write tests to be sure that no part of the app breaks
  • The application needs logic to update a document on demand to a new schema. I still need to think about what such logic should look like (it should not bloat my app), but you showed a good start
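
One possible shape for the second point is a small upgrader that stores a version number in each document and applies any pending steps when the document is read. This is only a sketch under assumptions: the `DocumentUpgrader` class and the `_v` field name are made up for illustration and are not part of MongoYii:

```php
<?php
// Hypothetical upgrader: each registered step lifts a document from
// version N to version N+1. Documents without a `_v` field count as version 0.
class DocumentUpgrader {
    private $steps = [];

    public function addStep($fromVersion, callable $step) {
        $this->steps[$fromVersion] = $step;
        return $this;
    }

    // Applies all pending steps in order and returns the upgraded document.
    public function upgrade(array $doc) {
        $version = isset($doc['_v']) ? $doc['_v'] : 0;
        while (isset($this->steps[$version])) {
            $doc = call_user_func($this->steps[$version], $doc);
            $version++;
            $doc['_v'] = $version;
        }
        return $doc;
    }
}

$upgrader = (new DocumentUpgrader())
    ->addStep(0, function (array $doc) {
        // Version 0 -> 1: move the single phone string under 'private'.
        if (isset($doc['phone']) && is_string($doc['phone'])) {
            $doc['phone'] = ['private' => $doc['phone']];
        }
        return $doc;
    });

$doc = $upgrader->upgrade(['phone' => '1234567890']);

echo $doc['_v'], "\n"; // prints "1"
```

All structure changes then live in one list of steps instead of being spread over getters, and a document that is already at the latest version passes through unchanged.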

Thanks for the discussion :)

2 participants