Skip to content

Conversation

@mbertrand
Copy link
Member

@mbertrand mbertrand commented Oct 31, 2023

What are the relevant tickets?

Closes #138

Description (What does it do?)

  • Adds a course_numbers JSONField to the Course model, to be populated in the format:
    {
      "value": "12.01",
      "listing_type": <"Primary" or "Cross-linked">,
      "department: {
          "department_id": "12",
          "name": "Earth, Atmospheric and Planetary Sciences"
      }
    }
    
  • Removes the Course.extra_course_numbers field.
  • Adds migrations to update the schema and convert existing courses to have populated values for course_numbers
  • Updates the serializers and ETL pipelines as necessary to support the above.
  • adds additional fields that are needed for indexing, in this case, modifying course.course_numbers to include ``sorted_coursenumandprimary`.
  • Modified the search api to exclude from search results any additional fields defined above, via SOURCE_EXCLUDED_FIELDS constant (which will need to be updated as necessary).

How can this be tested?

  • Run docker compose up. The migrations should run, and afterward, if you have any existing courses, the API should return results with each course resource having a populated {"course": {"course_numbers"}} section.
  • Run ./manage.py recreate_index --all
  • All the ETL pipelines should still work without errors.
  • Import an OCW course with extra course numbers if you don't already have any:
    ./manage.py backpopulate_ocw_data --course-name 4-401-environmental-technologies-in-buildings-fall-2018
    
  • Search for that course by both the primary and extra course number, it should be returned in both searches:
  • You should not see "sorted_coursenum" or "primary" fields in the course.course_numbers response
  • The primary value for this course should be "4.401" and not "4.401+fall_2018" like the readable_id
  • Modify this code so that SOURCE_EXCLUDED_FIELDS=[]
  • Try the above queries again, results should be the same but now sorted_coursenum and primary fields should be shown.

@codecov
Copy link

codecov bot commented Oct 31, 2023

Codecov Report

Merging #173 (39acf65) into main (fbab3b0) will increase coverage by 17.25%.
Report is 2 commits behind head on main.
The diff coverage is 95.31%.

@@             Coverage Diff             @@
##             main     #173       +/-   ##
===========================================
+ Coverage   60.78%   78.04%   +17.25%     
===========================================
  Files         134      307      +173     
  Lines        5036    13248     +8212     
  Branches      905     2161     +1256     
===========================================
+ Hits         3061    10339     +7278     
- Misses       1876     2645      +769     
- Partials       99      264      +165     
Files Coverage Δ
frontends/api/src/generated/api.ts 17.71% <ø> (ø)
.../api/src/test-utils/factories/learningResources.ts 91.35% <100.00%> (ø)
learning_resources/etl/constants.py 100.00% <100.00%> (ø)
learning_resources/etl/mitxonline.py 100.00% <ø> (ø)
learning_resources/etl/ocw.py 91.73% <100.00%> (ø)
learning_resources/etl/openedx.py 97.26% <100.00%> (ø)
learning_resources/etl/prolearn.py 92.42% <ø> (ø)
learning_resources/etl/xpro.py 100.00% <100.00%> (ø)
learning_resources/models.py 93.75% <100.00%> (ø)
learning_resources_search/api.py 88.75% <100.00%> (ø)
... and 5 more

... and 162 files with indirect coverage changes

"""
data = super().to_representation(instance)
if instance.resource_type == LearningResourceType.course.name:
add_extra_course_number_fields(data["course"]["course_numbers"])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The document structure in opensearch will look like this, (with primary and sorted_coursenum excluded from hit results output):

            "course": {
                "course_numbers": [
                    {
                        "value": "4.401",
                        "department": {
                            "name": "Architecture",
                            "department_id": "4"
                        },
                        "listing_type": "primary",
                        "primary": true,
                        "sort_coursenum": "04.401"
                    },
                    {
                        "value": "4.464",
                        "department": {
                            "name": "Architecture",
                            "department_id": "4"
                        },
                        "listing_type": "cross-listed",
                        "primary": false,
                        "sort_coursenum": "04.464"
                    }
                ]
            },

@abeglova is this a suitable replacement for department_course_numbers?

@mbertrand mbertrand added Needs Review An open Pull Request that is ready for review and removed Work in Progress labels Nov 2, 2023
@mbertrand mbertrand requested a review from abeglova November 2, 2023 17:19
@abeglova abeglova self-assigned this Nov 2, 2023
@mbertrand mbertrand merged commit d0ac934 into main Nov 3, 2023
@mbertrand mbertrand deleted the mb/coursenums branch January 16, 2024 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Needs Review An open Pull Request that is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor course number fields for Course model, serializer, search index

3 participants