Loading Chinese data into PostgreSQL fails #16

malcolmt opened this Issue · 3 comments

Running the migrations fails at 0003_* in the minerva app. There's a duplicate key problem somewhere in the zho data.


This is trickier than it looked at first sight. Some of the characters (words) appear multiple times in the Chinese file, up to four times in some cases (光, 行 and 顶). Each occurrence has a different meaning, so the lessons are introducing more advanced meanings of the same word.
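For anyone wanting to find the repeated words, here is a minimal sketch. It assumes the data can be read as (word, meaning) pairs; the real zho file's format isn't shown in this issue, `duplicated_words` is a hypothetical helper, and the 人 row is made up to show a non-duplicate.

```python
from collections import Counter

# Illustrative rows only. The 光 and 顶 meanings come from this thread;
# the 人 row is a made-up non-duplicate for contrast.
rows = [
    ("光", "light"),
    ("光", "nothing left"),
    ("光", "only; merely"),
    ("顶", "top"),
    ("顶", "carry on the head"),
    ("人", "person"),
]


def duplicated_words(rows):
    """Return {word: count} for every word that appears more than once."""
    counts = Counter(word for word, _meaning in rows)
    return {word: n for word, n in counts.items() if n > 1}


duplicates = duplicated_words(rows)
for word, n in duplicates.items():
    print(word, n)
```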

We probably need to change our models to accommodate multiple meanings for a single foreign word.
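To make the failure mode concrete, here is a rough sketch using the standard-library `sqlite3` module as a stand-in for PostgreSQL. The table and column names (`word_unique`, `word_multi`, `word`, `meaning`) are hypothetical and not the minerva app's actual schema. It only illustrates that a uniqueness constraint on the foreign word rejects the repeated rows, while a table without one accepts them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one enforces uniqueness on the word, one does not.
conn.execute("CREATE TABLE word_unique (word TEXT UNIQUE, meaning TEXT)")
conn.execute("CREATE TABLE word_multi (word TEXT, meaning TEXT)")

# Meanings for 光 as listed in this thread.
entries = [("光", "light"), ("光", "nothing left"), ("光", "only; merely")]


def load(table, entries):
    """Insert all entries in one transaction; return False on a duplicate key."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                f"INSERT INTO {table} (word, meaning) VALUES (?, ?)", entries
            )
    except sqlite3.IntegrityError:
        return False
    return True


unique_ok = load("word_unique", entries)  # the second 光 row violates UNIQUE
multi_ok = load("word_multi", entries)    # no constraint, so all rows load
multi_count = conn.execute(
    "SELECT COUNT(*) FROM word_multi WHERE word = ?", ("光",)
).fetchone()[0]
```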


We cannot even differentiate by level. 光 occurs 3 times in level 2, and 顶 occurs 2 times in levels 2 and 3. In level 2, 光 can mean "light", "nothing left", and "only; merely", and 顶 can mean "top" and "carry on the head". These meanings are simple enough that joining them into a single entry would be possible.
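A minimal sketch of the joining idea, assuming rows of (word, level, meaning). `join_meanings` and the " / " separator are hypothetical choices, not anything in the project; a separator other than "; " is used because one of the meanings already contains a semicolon.

```python
def join_meanings(rows, sep=" / "):
    """Collapse rows sharing (word, level) into one entry with joined meanings."""
    merged = {}
    for word, level, meaning in rows:
        merged.setdefault((word, level), []).append(meaning)
    return {key: sep.join(meanings) for key, meanings in merged.items()}


# Level 2 entries as described in this thread.
rows = [
    ("光", 2, "light"),
    ("光", 2, "nothing left"),
    ("光", 2, "only; merely"),
    ("顶", 2, "top"),
    ("顶", 2, "carry on the head"),
]

merged = join_meanings(rows)
```

This keeps one row per word and level, at the cost of losing the distinction between the separate lesson entries.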


Permit multiple entries for a particular foreign word.

Closed by 54ff9fa. After discussion with Matt, we decided not to enforce any
uniqueness constraint here. This required retrofitting a couple of
migrations, but since the code doesn't work for databases that enforce
unique constraints prior to this commit, this won't break any existing

