- `_id` (bson.ObjectId)
- `username` (string) primary key, indexed (e.g. "GeertJohan" in "@GeertJohan")
- `email` (string, optional)
- `avatar` (to be decided, link to GridFS file?)
- `admin` (bool) whether the user is an administrator or an ordinary user
- `color` (string) highlighting color chosen by the user

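As a rough sketch, the fields above could map onto a Go struct like the one below. The names `Account`, `ObjectId` and `bsonFieldNames` are assumptions made for illustration. `ObjectId` is a local stand-in for `bson.ObjectId` so the example compiles with the standard library only, and `avatar` is left out because its type is still undecided.

```go
package main

import (
	"fmt"
	"reflect"
)

// ObjectId is a local stand-in for bson.ObjectId (assumption: it lets the
// sketch compile without a MongoDB driver).
type ObjectId string

// Account mirrors the fields listed above. The struct name is hypothetical.
// The avatar field is omitted because its type is still to be decided.
type Account struct {
	ID       ObjectId `bson:"_id"`
	Username string   `bson:"username"`
	Email    string   `bson:"email,omitempty"`
	Admin    bool     `bson:"admin"`
	Color    string   `bson:"color"`
}

// bsonFieldNames returns the bson struct tag of every field of v's type.
func bsonFieldNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		names = append(names, t.Field(i).Tag.Get("bson"))
	}
	return names
}

func main() {
	// The color value is only an example; the format is not specified above.
	a := Account{Username: "GeertJohan", Admin: true, Color: "#ffcc00"}
	fmt.Println(a.Username, bsonFieldNames(a))
}
```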
- `_id` (bson.ObjectId)
- `username` primary key, refers to `accounts.username`
- `tags` ([]string) list of tags this user is interested in

technical parameters (for system)

- `_id` (bson.ObjectId)
- `uploaderUsername` (string, refers to `accounts.username`)
- `uploadFilename` (string) // filename of the original upload
- `uploadGridFilename` (string, refers to the location of the original document in GridFS)
- `uploadDate` (time.Time) // date of publication on Nulpunt

content parameters (for people)

- `language` (string) // language of the document, to help the OCR (default 'nl_NL')
- `pageCount` (int) // number of pages
- `analyseState` (string) // one of "uploaded", "started", "completed", "error"
- `title` (string)
- `summary` (string)
- `category` (string) // "Kamerbrief", "Rapport", ...
- `tags` ([]string) // these come from the tags collection
- `FOIRequester` (string) // Wobber (the person who filed the FOI request)
- `FOIARequest` (string) // Wob-verzoek (the FOI request)
- `originalDate` (time.Time) // time of publishing by the government agency, or date of the FOI response
- `source` (string) // "NL - Binnenlandse zaken", "EN - Foreign affairs", "US - Foreign affairs"
- `country` (string) // "NL", "EN"
- `published` (bool) // true: document is visible to users; false: new or not yet processed document
- `hits` (int)

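Because `analyseState` is a string restricted to four options, it can help to pin those values down in code. The following is a minimal sketch; the type name `AnalyseState`, the constant names and the `Valid` method are assumptions, only the four string values come from the field description.

```go
package main

import "fmt"

// AnalyseState is a hypothetical named type for the analyseState field.
type AnalyseState string

// The four documented options for analyseState.
const (
	StateUploaded  AnalyseState = "uploaded"
	StateStarted   AnalyseState = "started"
	StateCompleted AnalyseState = "completed"
	StateError     AnalyseState = "error"
)

// Valid reports whether s is one of the four documented options.
func (s AnalyseState) Valid() bool {
	switch s {
	case StateUploaded, StateStarted, StateCompleted, StateError:
		return true
	}
	return false
}

func main() {
	fmt.Println(AnalyseState("started").Valid())  // true
	fmt.Println(AnalyseState("finished").Valid()) // false
}
```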
- `_id` (bson.ObjectId)
- `tag` (string)

Note: tags have an ObjectId, but it is not meant for referencing from other collections. Just insert the tag string into other collections where needed.

- `_id` (bson.ObjectId)
- `documentId` (bson.ObjectId, refers to `documents._id`)
- `pageNumber` (int, page number)
- `lines` ([][]char-object)
- `text` (string) the text in the same order as the `lines` attribute; use for search/sharing. Contains OCR errors.
- `highresWidth` (int) the width (in pixels) of the highres (900dpi) render
- `highresHeight` (int) the height (in pixels) of the highres (900dpi) render

- `x1` (float32, left) in percentage relative to the image
- `y1` (float32, top) in percentage relative to the image
- `x2` (float32, right) in percentage relative to the image
- `y2` (float32, bottom) in percentage relative to the image
- `c` (string, character)

- `_id` (bson.ObjectId)
- `documentId` (bson.ObjectId, refers to `documents._id`)
- `annotator` (string)
- `createDate` (time.Time)
- `annotationText` (string)
- `comments` ([]comment)
- `locations` ([]object) // in future, there could be multiple sections in a single annotation
  - `pageNumber` (int) index
  - `x1` (float32, left) in percentage relative to the image
  - `y1` (float32, top) in percentage relative to the image
  - `x2` (float32, right) in percentage relative to the image
  - `y2` (float32, bottom) in percentage relative to the image

- `_id` (bson.ObjectId) // needed for tree walking, to put new comments in the right place
- `commenterUsername` (string, refers to `accounts.username`)
- `createDate` (time.Time)
- `commentText` (string)
- `comments` ([]comment) // recursion, disabled for first version?? (January/February)

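The tree walking mentioned for the comment `_id` could look roughly like the sketch below: a depth-first search for the parent comment, followed by an append to its `comments` list. The names `Comment` and `addReply` are assumptions, and `ID` is a plain string standing in for `bson.ObjectId` so the example needs no database driver.

```go
package main

import "fmt"

// Comment mirrors the comment object. ID is a plain string standing in for
// bson.ObjectId (assumption made to keep the sketch self-contained).
type Comment struct {
	ID                string
	CommenterUsername string
	CommentText       string
	Comments          []Comment
}

// addReply walks the tree depth-first and appends reply under the comment
// whose ID equals parentID. It reports whether the parent was found.
func addReply(comments []Comment, parentID string, reply Comment) bool {
	for i := range comments {
		if comments[i].ID == parentID {
			comments[i].Comments = append(comments[i].Comments, reply)
			return true
		}
		if addReply(comments[i].Comments, parentID, reply) {
			return true
		}
	}
	return false
}

func main() {
	thread := []Comment{
		{ID: "a", CommentText: "first", Comments: []Comment{
			{ID: "b", CommentText: "nested"},
		}},
	}
	ok := addReply(thread, "b", Comment{ID: "c", CommentText: "reply to nested"})
	fmt.Println(ok, thread[0].Comments[0].Comments[0].ID) // true c
}
```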
We're using GridFS to store files.
Filename must be formatted as: `uploads/<uploader-username>-<unix-timestamp>-<random-string-10-chars>-<original-filename>`
Holds original uploaded file.
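A minimal sketch of building such an upload filename follows. The function names `UploadFilename` and `randomString` and the lowercase alphanumeric alphabet are assumptions; the format above only fixes the order of the parts and the 10-character length of the random string.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// randomChars is an assumed alphabet; the format only asks for 10 random characters.
const randomChars = "abcdefghijklmnopqrstuvwxyz0123456789"

// randomString returns n random characters drawn from randomChars.
// The modulo step is slightly biased, which is acceptable here because the
// string only has to make filename collisions unlikely.
func randomString(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = randomChars[int(b)%len(randomChars)]
	}
	return string(buf), nil
}

// UploadFilename builds
// uploads/<uploader-username>-<unix-timestamp>-<random-string-10-chars>-<original-filename>.
func UploadFilename(username string, unixTimestamp int64, originalFilename string) (string, error) {
	r, err := randomString(10)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("uploads/%s-%d-%s-%s", username, unixTimestamp, r, originalFilename), nil
}

func main() {
	name, err := UploadFilename("GeertJohan", time.Now().Unix(), "report.pdf")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```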
Filename must be formatted as: `highres/<docIdHex>-<pageNumber>.png`
Holds a png for each page of the given document, rendered at 600dpi from the pdf file.
Filename must be formatted as: `docviewer-pages/<docIdHex>-<pageNumber>.png`
Holds a png for each page of any given document, resized to a width of 1000 px.
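The two per-page filename formats can be produced by small helpers, sketched here. The function names are assumptions, the document id in `main` is a made-up 24-character hex string, and whether page numbers start at 0 or 1 is not specified above.

```go
package main

import "fmt"

// HighresFilename builds highres/<docIdHex>-<pageNumber>.png.
func HighresFilename(docIDHex string, pageNumber int) string {
	return fmt.Sprintf("highres/%s-%d.png", docIDHex, pageNumber)
}

// DocviewerFilename builds docviewer-pages/<docIdHex>-<pageNumber>.png.
func DocviewerFilename(docIDHex string, pageNumber int) string {
	return fmt.Sprintf("docviewer-pages/%s-%d.png", docIDHex, pageNumber)
}

func main() {
	id := "52d3e1f0a1b2c3d4e5f60718" // hypothetical ObjectId in hex form
	fmt.Println(HighresFilename(id, 1))   // highres/52d3e1f0a1b2c3d4e5f60718-1.png
	fmt.Println(DocviewerFilename(id, 1)) // docviewer-pages/52d3e1f0a1b2c3d4e5f60718-1.png
}
```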