Google Summer of Code 2019
Anonymisation Through Data Encryption of Sensitive Data in ODT and Text Files in Greek Language
Over the past year, governments around the world have attached great importance to the anonymisation of information. The GDPR defines pseudonymization as the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information. Although the GDPR has been in force since 2018, no reliable infrastructure exists in Greece to encrypt sensitive documents. It is therefore necessary to develop a product specifically for users of the Greek language that can safely and promptly anonymize their data so that it complies with the GDPR.
I propose the creation of a LibreOffice extension and a web GUI that anonymize information in any given legal document. This open-source tool should make it easy to anonymize all sensitive information.
For the anonymizer itself, I suggest the following requirements. First, given any document, the anonymizer should encrypt every Greek entity in the file that belongs to a standard token vocabulary set. Users will be able to specify additional entities to be anonymized (beyond the standard ones) and will have the option of applying an additional layer of encryption. The LibreOffice extension and the web GUI should both be user-friendly, so customizable technologies should be used.
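The requirements above can be illustrated with a minimal Python sketch. It is not the project's implementation: the `STANDARD_PATTERNS` table, the `build_pattern` and `anonymize` functions, and the `key` parameter are hypothetical names chosen for this example, and the three regular expressions are rough assumptions about what a standard vocabulary might contain. The optional keyed hash only stands in for the "additional encryption" step; it is one-way pseudonymization, not reversible encryption.

```python
import hashlib
import hmac
import re

# Hypothetical "standard vocabulary": entity label -> regex pattern.
# These three patterns are illustrative assumptions, not the project's real set.
STANDARD_PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
    "PHONE": r"\b(?:2\d{9}|69\d{8})\b",  # rough 10-digit Greek landline or mobile number
    "AFM": r"\b\d{9}\b",                 # 9-digit tax identification number
}


def build_pattern(extra_terms=()):
    """Combine the standard patterns and user-supplied literal terms into one regex."""
    parts = [f"(?P<{label}>{pattern})" for label, pattern in STANDARD_PATTERNS.items()]
    parts += [f"(?P<CUSTOM_{i}>{re.escape(term)})" for i, term in enumerate(extra_terms)]
    return re.compile("|".join(parts))


def anonymize(text, extra_terms=(), key=None):
    """Replace every matched entity with a tag such as [AFM].

    If `key` (bytes) is given, a short keyed hash of the original value is
    appended, so equal values map to the same tag. This is a stand-in for the
    optional extra encryption step, not reversible encryption.
    """
    def replace(match):
        label = match.lastgroup  # name of the outer group that matched
        if key is None:
            return f"[{label}]"
        digest = hmac.new(key, match.group(0).encode("utf-8"), hashlib.sha256).hexdigest()
        return f"[{label}:{digest[:8]}]"

    return build_pattern(extra_terms).sub(replace, text)


text = "Ο Γιάννης Παπαδόπουλος (ΑΦΜ 123456789, τηλ. 6912345678) έγραψε στο info@example.gr"
print(anonymize(text, extra_terms=["Γιάννης Παπαδόπουλος"]))
# Ο [CUSTOM_0] (ΑΦΜ [AFM], τηλ. [PHONE]) έγραψε στο [EMAIL]
```

Building a single combined pattern means the text is scanned once, so a tag that has already been inserted is never re-matched by a later rule. Passing a secret such as `key=b"..."` keeps repeated values linkable across the document without exposing them.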
Extensive documentation has been written in the wiki pages so that the service is understandable and maintainable.
Improvements to the user interface.
Extending the web GUI so that it can be hosted in a VM and serve multiple clients at the same time.
Machine learning techniques to identify sensitive information in text.
Resolving any open issues.
For more information, see the future work page in the wiki.
Final Report Gist
You can find the final report here.
Google Summer of Code participant: Dimitrios Katsiros
Mentor: Kostas Papadimas
Mentor: Panos Louridas
Mentor: Iraklis Varlamis
This project is open source as part of the Google Summer of Code program and is released under the MIT license. For more information, see LICENSE.