-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PLAT-926] Add employment and institution information to user serializer #8546
[PLAT-926] Add employment and institution information to user serializer #8546
Conversation
Pull Request Test Coverage Report for Build 30530
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thorough tests! I would use jsonschema
for validation, which will take care of a lot of this for you. This involves quite a bit of changes to your validation and probably tests.
api/users/serializers.py
Outdated
) | ||
user.social[key] = val[0] | ||
|
||
def _validate_user_json(self, info_list, required_fields): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend using jsonschema
for most of this validation, instead of individually type checking each field. We use this for validating form information for draft registrations. I doubt jsonschema can validate for everything that you want, for example, end date must be greater that start date, but it works well for required fields, field types, etc. You can also define field dependencies (if endMonth is present, endYear must be present, for example).
You would do something like:
- import jsonschema
- define your schema, started below:
jsonschema.validate(validated_data, schema)
Here's a start on the schema for education data, doesn't have everything:
{
"type": "array",
"items": {
"type": "object",
"properties": {
"degree": {
"type": "string",
},
"startYear": {
"type": "integer",
},
"startMonth": {
"type": "integer"
},
"endMonth": {
"type": "integer"
},
"endYear": {
"type": "integer"
},
"ongoing": {
"type": "boolean"
},
"department": {
"type": "string"
},
"institution": {
"type": "string"
}
},
"required": [
"startYear",
"startMonth",
"ongoing",
"department",
"institution",
"degree"
],
"dependencies": {
"endMonth": ["endYear"],
"endYear": ["endMonth"]
}
}
}
a87ef3e
to
844a624
Compare
844a624
to
9c735bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job adding jsonschema! Just a few remaining comments, mostly around cleanup.
api/users/serializers.py
Outdated
@@ -128,6 +134,44 @@ def profile_image_url(self, user): | |||
size = self.context['request'].query_params.get('profile_image_size') | |||
return user.profile_image_url(size=size) | |||
|
|||
def _update_social(self, social_dict, user): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was pulled from the first part of the update
method? doesn't look like it's being used.
api/users/serializers.py
Outdated
employment = JSONAPIListField(required=False, source='jobs') | ||
employment_required_keys = {'department', 'institution', 'ongoing', 'startMonth', 'startYear', 'title'} | ||
education = JSONAPIListField(required=False, source='schools') | ||
education_required_keys = {'department', 'institution', 'ongoing', 'startMonth', 'startYear', 'degree'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove employment_required_keys
and education_required_keys
since we're using jsonschema.
api/users/education-schema.json
Outdated
"endMonth": { | ||
"type": "integer", | ||
"minimum": 1, | ||
"maximum": 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
}] | ||
|
||
def test_user_put_schools_401(self, app, user_one, url_user_one): | ||
# Tests to make sure institution is not empty string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have this comment in a lot of places where it's not accurate.
api/users/serializers.py
Outdated
raise InvalidModelValueError("For '{}' the field value {}".format(e.path[-1], e.message)) | ||
|
||
raise InvalidModelValueError(e.message) | ||
except jsonschema.SchemaError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catching of the different schema errors.
}, | ||
} | ||
}, expect_errors=True) | ||
assert res.status_code == 401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test_user_put_schools_403/test_user_put_jobs_403 test for users trying to edit someone else's profile info?
Looks great @Johnetordoff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent use of jsonschema and good test coverage here, @Johnetordoff!
It looks like we could still DRY up the tests quite a bit by creating a mixin since the only difference between most of the jobs and schools tests are the payloads. Something like:
class UserProfileMixin(object):
@pytest.fixture()
def request_payload(self):
raise NotImplementedError
# define other fixtures
def test_user_put_profile_info_200(...):
...
def test_user_put_profile_info_401(...):
...
def test_user_put_profile_info_403(...):
...
def test_user_put_profile_info_validation(...):
...
def test_user_put_profile_info_date_validation(...):
....
class TestUserProfileJobs(UserProfileMixin):
@pytest.fixture()
def request_payload(self):
...
class TestUserProfileSchools(UserProfileMixin):
@pytest.fixture()
def request_payload(self):
...
The advantage here is that shared tests can be added in a single place (the mixin) instead of duplicated for both jobs and schools. Also if a test needs to be changed, it only needs to be changed in one place instead of several.
api/users/employment-schema.json
Outdated
"title" | ||
], | ||
"additionalProperties": false, | ||
"dependencies": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be dependencies for startMonth
and startYear
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, they are not necessary because startMonth
and startYear
are both always required no matter what, so they're not really dependencies of each other.
api/users/education-schema.json
Outdated
"degree" | ||
], | ||
"additionalProperties": false, | ||
"dependencies": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be dependencies for startMonth
and startYear
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, they are not necessary because startMonth
and startYear
are both always required no matter what, so they're not really dependencies of each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that doesn't match current functionality. I'm able to add an education entry to my profile without a start date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you double check these dependencies with functionality on production? Start year and end year can be entered and saved without start month and end month, but months can't be saved without years. I also think there should be a "endYear": "startYear"
dependency here because an end year can't be saved unless there's a start year.
'type': 'users', | ||
'attributes': { | ||
'full_name': user_one.fullname, | ||
'education': [{'degree': '', |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
}, | ||
} | ||
}, auth=user_one.auth, expect_errors=True) | ||
print res.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rogue print
'type': 'users', | ||
'attributes': { | ||
'full_name': user_one.fullname, | ||
'education': [{'degree': '', |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
api/users/serializers.py
Outdated
@@ -142,6 +172,13 @@ def update(self, instance, validated_data): | |||
detail='{} only accept a list of one single value'. format(key) | |||
) | |||
instance.social[key] = val[0] | |||
elif 'schools' == attr: | |||
self._validate_user_json(value, 'education-schema.json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to reduce the amount of logic that happens in this method (and better separate logic), you could implement field level validators for schools
and jobs
(each method would just call self._validate_user_json(...)
).
See @erinspace's work in #8551 for an example of this.
api/users/schemas.py
Outdated
|
||
here = os.path.split(os.path.abspath(__file__))[0] | ||
|
||
def from_json(fname): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the use of the helper function. One nitpick: it's a bit of a misnomer. From the name, I'd expect from_json
to take a JSON string as input, but it actually takes a filepath. Perhaps load_jsonschema
would be more appropriate?
Also see my other comment about co-locating all the jsonschema-related code.
'type': 'users', | ||
'attributes': { | ||
'full_name': user_one.fullname, | ||
'education': [{'degree': '', |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
941a1b2
to
57dabbf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few small changes, namely a few more dependencies need to be added to the jsonschemas.
api/users/education-schema.json
Outdated
"degree" | ||
], | ||
"additionalProperties": false, | ||
"dependencies": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that doesn't match current functionality. I'm able to add an education entry to my profile without a start date.
api/users/serializers.py
Outdated
@@ -142,6 +180,10 @@ def update(self, instance, validated_data): | |||
detail='{} only accept a list of one single value'. format(key) | |||
) | |||
instance.social[key] = val[0] | |||
elif 'schools' == attr: | |||
instance.schools = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these assignment statements are necessary anymore. The fields will be set by the setattr(instance, attr, value)
call below.
request_payload['data']['attributes'][request_key] = [{}] | ||
res = app.put_json_api(user_one_url, request_payload, auth=user_one.auth, expect_errors=True) | ||
assert res.status_code == 400 | ||
assert res.json['errors'][0]['detail'] == "u'startYear' is a required property" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned above -- this doesn't match current functionality.
into jobs-and-schools-v2 * 'develop' of https://github.com/CenterForOpenScience/osf.io: (87 commits) Bump version and update changelog fix tests for non-dict based client attributes Update gitlab serializers for fangorn config js Add enforce_csrf waffle switch so that this can be merged earlier Update markdown-it-video with infinite loop fix Fix indentation Improve tests by using user cookie method and faster user creation Tests to ensure API requests with Session auth fail without CSRF protection Travis and Test Speed ups Rename --quick to --all Don't transact to avoid deadlocks Add script to fix poor mapping logic -h/t @caseyrollins Add same enforce_csrf method as drf's SessionAuthentication replace bad reverse migration with warning Use timezone.now() when checking for failed registrations Remove duplicate psyarxiv custom subject. Use fork of celery 4.2.1 with celery/celery#4839 merged Upgrade celery and kombu Set TIME_ZONE in settings add scheme to match UI ...
… into jobs-and-schools-v2 * 'master' of https://github.com/CenterForOpenScience/osf.io: fix mocking for gitlab
into jobs-and-schools-v2 * 'develop' of https://github.com/CenterForOpenScience/osf.io:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just a few more gotchas with the dependencies and validation.
Would you mind also adding a test to confirm that unexpected fields are ignored? i.e. if someone attempts to send something like this:
{
'pizza': 'pepperoni',
'title': '',
'startYear': 1991,
'startMonth': 9,
'endYear': 1992,
'endMonth': 9,
'ongoing': False,
'department': '',
'institution': 'Fake U'
}
then we should throw out the pepperoni pizza.
api/users/serializers.py
Outdated
jsonschema.validate(value, from_json(json_schema)) | ||
except jsonschema.ValidationError as e: | ||
if len(e.path) > 1: | ||
raise InvalidModelValueError("For '{}' the field value {}".format(e.path[-1], e.message)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you post an example of the existing error and why reformatting is necessary/beneficial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing message does not have the name of the field erroring in it. For example if institution field is an empty string, the error is just to '' is too short
. So as you can see the new message indicates the erroring field and what is happening to it.
api/users/education-schema.json
Outdated
}, | ||
"required": [ | ||
"ongoing", | ||
"department", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should department
and degree
be in this list? I'm able to add an education entry to my profile with only "Institution" set.
api/users/education-schema.json
Outdated
"degree" | ||
], | ||
"additionalProperties": false, | ||
"dependencies": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you double check these dependencies with functionality on production? Start year and end year can be entered and saved without start month and end month, but months can't be saved without years. I also think there should be a "endYear": "startYear"
dependency here because an end year can't be saved unless there's a start year.
api/users/employment-schema.json
Outdated
}, | ||
"required": [ | ||
"ongoing", | ||
"department", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above: should department and title be in this list?
api/users/serializers.py
Outdated
def _validate_dates(self, info): | ||
for history in info: | ||
|
||
startDate = datetime.date(history['startYear'], history['startMonth'], 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed above: startMonth
and endMonth
aren't required so these lines will throw KeyError
s
Signed-off-by: John Tordoff <john@cos.io>
api/users/serializers.py
Outdated
@@ -148,13 +148,16 @@ def _validate_user_json(self, value, json_schema): | |||
def _validate_dates(self, info): | |||
for history in info: | |||
|
|||
startDate = datetime.date(history['startYear'], history['startMonth'], 1) | |||
if history.get('startYear'): | |||
startDate = datetime.date(history['startYear'], history['startMonth'], 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be
if history.get('startYear'):
startDate = datetime.date(history['startYear'], history.get('startMonth'), 1)
to prevent KeyErrors.
api/users/serializers.py
Outdated
if (endDate - startDate).days <= 0: | ||
raise InvalidModelValueError(detail='End date must be greater than or equal to the start date.') | ||
elif history.get('endYear') or history.get('endMonth'): | ||
raise InvalidModelValueError(detail='Ongoing positions cannot have end dates.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe adding the following to the json schema will validate this for you:
"dependencies": {
"endMonth": ["endYear"],
"startMonth": ["startYear"],
"endYear": ["startYear"]
},
"anyOf": [
{
"not": {
"properties": {
"ongoing": { "enum": [true] }
},
"required": ["ongoing"]
}
},
{ "required": ["startYear"] }
],
"anyOf": [
{
"not": {
"properties": {
"ongoing": { "enum": [false] }
},
"required": ["ongoing"]
}
},
{ "required": ["endYear"] }
],
"not": {
"properties": {
"ongoing": {"enum": [true]}
},
"required": ["endYear"]
}
The "not" will ensure that "endYear" is not included if "ongoing" is true.
The "anyOf" will ensure that "startYear" is required if "ongoing" is true and that "endYear" is included if "ongoing" is false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hesitant to add this to the schema because:
A) It's hard for an outside observer to immediately know what this is for and I can't comment inside a JSON file to make this clearer.
B) The error message this produces isn't really human readable, getting back to the current message is likely going to include a lot of catching and re-throwing.
But there's definitely a benefit to always relying using the schema, so 🤷♂️ .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I definitely agree that the error messages are way less than ideal. My opinion is that using the schema for validation is a lot nicer than writing our own, and we can document this in a more human-readable format in the dev docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
fe0ad65
to
811b7b7
Compare
ac63940
to
b7d9324
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like each schema just needs an "endYear": "startYear"
dependency and this will be RTM.
# End dates, but no start dates | ||
res = app.put_json_api(user_one_url, end_dates_no_start_dates_payload, auth=user_one.auth) | ||
user_one.reload() | ||
assert res.status_code == 200 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api/users/serializers.py
Outdated
@@ -128,6 +132,40 @@ def profile_image_url(self, user): | |||
size = self.context['request'].query_params.get('profile_image_size') | |||
return user.profile_image_url(size=size) | |||
|
|||
def _validate_user_json(self, value, json_schema): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure these should be instance methods--there's no usage of self. How about co-locating all the jsonschema validation functions in your new api/users/schemas.py
module?
api/users/education-schema.json
Outdated
@@ -0,0 +1,62 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we put these in a directory?
api/users/schemas
├── education-schema.json
├── ....schema.json
└── utils.py # contains the functions for using the schemas -- see previous comments
osf/models/validators.py
Outdated
@@ -38,7 +38,7 @@ def validate_year(item): | |||
except ValueError: | |||
raise ValidationValueError('Please enter a valid year.') | |||
else: | |||
if len(item) != 4: | |||
if type(item) == str and len(item) != 4: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's multiple string types in Python (and the names differ between Python 2 and 3), it's best to do this when checking against a string:
from django.utils.six import string_types
isinstance(thing, string_types)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits mostly re: code organization. Once those are addressed, this is good to merge.
70eaad7
to
633aa46
Compare
Purpose
As part of Emberization we're exposing everything to our V2 API and that includes employment and institution information. This infomation sadly lacks self-esteem and must be constantly validated, so here we are.
Changes
QA Notes
Issue different PUT requests against the user endpoint with various payloads and permissions. Make sure error messages/info sent back is correct.
Documentation
When this is approved and the behavior is confirmed I'll write up dev docs.
Side Effects
None that I know of.
Ticket
https://openscience.atlassian.net/projects/PLAT/issues/PLAT-926