Multilingual findings #8

Closed
noraj opened this issue May 8, 2023 · 12 comments
Labels
enhancement New feature or request in progress

Comments

@noraj

noraj commented May 8, 2023

Description and why

Pentesters in English-speaking countries are probably the only ones who don't need this feature.

In other countries, you need a finding library in both English and your native language, and some countries have two, three or more official languages.

In non-English-speaking countries, pentest reports often have to be written in several languages, so a multilingual vulnerability database is critical.

Implementation

It requires a change to the SQL tables.

Instead of having something like

vulns:
  - vuln1:
    title: xxx
    description: xxx
    cvss: xxx
  - vuln2:
    title: xxx
    description: xxx
    cvss: xxx

You would have

vulns:
  - vuln1:
    cvss: xxx
    lang:
      - en:
        title: xxx
        description: xxx
      - fr:
        title: xxx
        description: xxx
  - vuln2:
    cvss: xxx
    lang:
      - en:
        title: xxx
        description: xxx
      - fr:
        title: xxx
        description: xxx
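
The same idea can be sketched as a small data model. This is only an illustration of the proposed layout, not SysReptor's or PwnDoc's actual schema; the class names Vuln and Translation and the sample texts are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Translation:
    """Language-dependent fields (hypothetical names)."""
    title: str
    description: str


@dataclass
class Vuln:
    """One vulnerability: shared fields at the root, text under `lang`."""
    cvss: str  # language-independent, stored only once
    lang: Dict[str, Translation] = field(default_factory=dict)

    def text(self, code: str) -> Translation:
        # Raises KeyError if the vulnerability has no such translation.
        return self.lang[code]


sqli = Vuln(
    cvss="CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
    lang={
        "en": Translation("SQL injection", "User input reaches a SQL query unescaped."),
        "fr": Translation("Injection SQL", "Une entrée utilisateur atteint une requête SQL sans échappement."),
    },
)
```

With this shape, sqli.text("fr").title and sqli.text("en").title read two translations of one record, and the CVSS vector exists in exactly one place.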

Workaround

A common workaround and why it is bad.

A common bad workaround is to add a language prefix to the vulnerability title, like [EN] SQL injection and [FR] Injection SQL.

Or, as here in SysReptor, to create the same vulnerability twice with a different value in the language field.

image

This is terrible for multiple reasons.

With multiple languages, only fields containing text or sentences need to be translated. All other fields, like the CVSS vector, CVE, vulnerability ID, etc., don't need to be translated and can be stored only once in the database.

Also, when you edit the vuln in one language and the versions are not linked, you often forget to update the vuln in the other languages too.

It would also be possible to filter by language.

And in a report, you can't ask for vuln.fr.description or vuln.en.description depending on whether you use a French or an English template.

Demo

It's a bit long and hard to explain in detail.
I invite you to deploy and test PwnDoc (https://github.com/pwndoc/pwndoc), which is the only pentest report platform I know of that has a multi-lang vuln DB. It's easy to deploy with docker-compose, so it won't take long to try.

Here is what I mean, on video. I'm talking about multilingual finding templates (the data stored, the DB schema, etc.), not the translation of the labels in the web UI (i18n):

pwndoc-2023-01-04_19.47.40.mp4
@aronmolnar
Contributor

Yeah, thanks. We already added more languages in our dev branch: English, German, Spanish, French, Portuguese, Dutch, Italian, Danish, Polish, Ukrainian, Romanian, Slovak, Slovenian, Greek, Swedish

If anyone needs additional languages, an issue can be opened.

We discussed the feature of multilingual finding templates in the past and for the moment decided to go with having separate finding templates for different languages (as in your screenshot under "Workaround").
It is true that certain fields (like the CVSS score and references) need to be maintained twice in this case, which is not ideal.

However, in our own setup it is currently only:

  • CVSS
  • References
  • Tags

Because most of the fields have to be maintained separately anyway, we decided to keep the templates separate for the moment.
However, we are thinking of linking finding templates of different languages (in the database), which should then provide the functionality you want. We might implement this in the future.

@noraj
Author

noraj commented May 9, 2023

Maintaining the CVSS, references and tags twice, while annoying, is not the biggest issue.

When the different language versions are not linked, some issues will arise over time:

  • the finding template list or search results will be bloated because the same finding will appear once per language. I saw there was a List findings only matching current language when adding a finding on a project but there is no language filter offered on the template page for example. By the way I didn't see a language field while creating a project so does the project use English by default and there is no way to select the project language for now or is it matching the user interface display language?
  • one will update a description or recommendation based on the project they are currently working on but forget to propagate the update to the other languages. The same goes for finding template creation.
  • as the title differs between languages, anyone who wants to update every language version of the same finding template has to remember the keywords used in each language (several synonyms could have been used in translation) to be able to search for the other versions. For example, did I translate Use of deprecated components into French as Utilisation de composants obsolètes or Usage de composants non supportés? So one loses time searching for the correct translation whenever an update is required. If findings are linked, on the other hand, one just has to switch the language to find the other version.
  • in report templates, imagine you have to provide a report in both French and English. If the findings are linked, I just have to import the templates once and then write the variable fields in both languages. My report template will use something like finding[i][lang].description, so I only have to change the lang when exporting and export twice. However, if findings are not linked, I will have to create two separate projects and import all finding templates once per language, so the project exists twice. Or I can create only one project and import the finding templates twice (once per lang), but then I have to add a filter in the report template to select findings only if their language matches the project. In both workarounds I will have to change the CVSS and maybe some tags or custom fields twice.
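
To make the last point concrete, here is a minimal sketch of a report export that takes the language as a parameter. The render_report function and the dictionary layout are assumptions for illustration only; they do not reflect how SysReptor or PwnDoc render reports.

```python
# Linked findings: shared fields at the root, text keyed by language code.
findings = [
    {
        "cvss": "9.8",
        "en": {"title": "SQL injection", "description": "Input is not escaped."},
        "fr": {"title": "Injection SQL", "description": "L'entrée n'est pas échappée."},
    },
]


def render_report(findings, lang):
    """Render a plain-text finding list in the requested language."""
    lines = []
    for i, finding in enumerate(findings, start=1):
        text = finding[lang]  # the equivalent of finding[i][lang]
        lines.append(f"{i}. {text['title']} (CVSS {finding['cvss']})")
        lines.append(f"   {text['description']}")
    return "\n".join(lines)


# Same project, same findings, exported twice with a different lang.
report_en = render_report(findings, "en")
report_fr = render_report(findings, "fr")
```

The CVSS score is edited once and shows up in both exports, which is the part the two workarounds cannot offer.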

The earlier this feature is implemented, the better and easier, because it requires a structural change to the code and the database schema, which means breaking changes. The more features the platform gains and the more complex it becomes, the harder it will be to introduce this feature and the more breakage it will cause.

@aronmolnar
Contributor

Absolutely fair points :)

By the way I didn't see a language field while creating a project so does the project use English by default and there is no way to select the project language for now or is it matching the user interface display language?

Every design has a language too. If you create a project, it inherits the language from the design.
However, you can change the language after creating the project.

image

@MWedl
Contributor

MWedl commented May 9, 2023

I saw there was a List findings only matching current language when adding a finding on a project but there is no language filter offered on the template page for example.

In the template list you can filter for the language code in the search field. Internally we use Postgres full text search for template title, tags and language field. Since the language is stored as language code in the DB, you can only search for the code ("en", "de") and not the display label ("English", "German"). We have not added a language filter to the search UI yet.
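
Since the search only matches the stored language code, a caller could map a display label to its code before searching. A minimal sketch, assuming a hypothetical label table and helper name (neither is part of SysReptor):

```python
# Hypothetical label table; only "en" and "de" are named in the comment above.
LANGUAGE_LABELS = {"en": "English", "de": "German"}


def to_language_code(term):
    """Return the language code for a code or display label, else None."""
    needle = term.strip().lower()
    for code, label in LANGUAGE_LABELS.items():
        if needle in (code, label.lower()):
            return code
    return None
```

Typing "German" would then be turned into "de" before it reaches the search field.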

@aronmolnar
Contributor

aronmolnar commented May 9, 2023

We discussed how this feature could be implemented.

Finding templates of different languages would be linked to each other.
If a finding template is open, the user has the options to:

  • use a drop-down to switch to a finding template of a different language
  • create a finding template of a different language
  • link an existing template

There needs to be a list of fields that are synchronized between linked templates. By default, this could be:

  • CVSS
  • Tags
  • References (even though syncing references might be unwanted)
  • OWASP categories

This list of fields could be modified, either via the Django admin interface (=database) or via environment variables. This means that this is an installation-wide setting and does not allow granular control.
It might also be possible to define which fields are synced in the finding field settings of the design. However, there might be conflicts in custom fields when they are defined by multiple designs with different sync settings (currently templates provide fields from all available designs, such that templates work for all designs). This is why we discarded the idea.

If a field that should be synced in an existing template is linked and contains different data (e.g. a different CVSS score), this field will be overwritten by the template that is currently open.

If a finding template is opened and a field that should be synced is updated, this update process overwrites values in linked finding templates (last save wins).
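
The "last save wins" rule could look roughly like the sketch below. The field keys, the SYNCED_FIELDS tuple and the save_template function are assumptions for illustration; the actual implementation may differ.

```python
import copy

# Assumed default list, taken from the bullets above; the key names are made up.
SYNCED_FIELDS = ("cvss", "tags", "references", "owasp_categories")


def save_template(current, linked, synced_fields=SYNCED_FIELDS):
    """Copy synced fields from the template being saved into every linked one."""
    for other in linked:
        for name in synced_fields:
            if name in current:
                # Deep copy so the templates do not share mutable lists.
                other[name] = copy.deepcopy(current[name])


en = {"language": "en", "title": "SQL injection", "cvss": "9.8", "tags": ["web"]}
de = {"language": "de", "title": "SQL-Injection", "cvss": "7.5", "tags": []}

# Saving the English template overwrites the synced fields of the German one.
save_template(en, [de])
```

Language-specific fields such as the title are left alone, while a diverging CVSS score in the linked template is silently replaced.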

When findings are created from templates, the user has to choose which language to use. The field contents of the selected language are then copied into the finding.
Multi-language templates are only available for managing the template library itself. When writing a report, the user has to decide on a language.

@noraj
Author

noraj commented May 9, 2023

This list of fields could be modified, either via the Django admin interface (=database) or via environment variables. This means that this is an installation-wide setting and does not allow granular control.
It might also be possible to define which fields are synced in the finding field settings of the design. However, there might be conflicts in custom fields when they are defined by multiple designs with different sync settings (currently templates provide fields from all available designs, such that templates work for all designs). This is why we discarded the idea.

If a field that should be synced in an existing template is linked and contains different data (e.g. a different CVSS score), this field will be overwritten by the template that is currently open.

If a finding template is opened and a field that should be synced is updated, this update process overwrites values in linked finding templates (last save wins).

I didn't dig too much into how this could be implemented, but I know PwnDoc has managed to get multilingual findings as well as custom fields. Rather than having two vulnerabilities and linking and syncing them, my idea was more one vulnerability with an unlimited number of lang properties. All variable fields would depend on the lang, while the fields that are constant across langs, such as cvss or tags, would sit at the root level and not under the lang scope, as shown in the implementation idea of my original message #8 (comment). TL;DR: it sounds better for the lang-dependent fields to be properties of a single vulnerability than to sync several vulnerabilities. There won't be sync issues if there is no sync :)

I don't see the issue with custom fields either, in theory (I didn't look at the actual DB schema). If custom field formats are defined in a table and have a unique ID, the unique ID of a custom field can be attached to a finding template and the value stored in a mapping table. This way, custom fields could be attached to a finding template by default on creation, but a field could be removed or added individually afterward.
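
A rough sketch of that mapping-table idea, using an in-memory SQLite database. All table and column names here are hypothetical and only illustrate the theory; they are not taken from the SysReptor or PwnDoc schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE custom_field (id INTEGER PRIMARY KEY, name TEXT UNIQUE, type TEXT);
CREATE TABLE finding_template (id INTEGER PRIMARY KEY, cvss TEXT);
-- One row per (template, field, language): the mapping table.
CREATE TABLE template_field_value (
    template_id INTEGER REFERENCES finding_template(id),
    field_id    INTEGER REFERENCES custom_field(id),
    lang        TEXT NOT NULL,
    value       TEXT,
    PRIMARY KEY (template_id, field_id, lang)
);
""")
conn.execute("INSERT INTO custom_field VALUES (1, 'remediation_effort', 'string')")
conn.execute("INSERT INTO finding_template VALUES (1, '7.5')")
conn.executemany(
    "INSERT INTO template_field_value VALUES (?, ?, ?, ?)",
    [(1, 1, "en", "Low"), (1, 1, "fr", "Faible")],
)


def field_value(template_id, field_name, lang):
    """Look up a custom field value for one template and language."""
    row = conn.execute(
        """SELECT v.value
           FROM template_field_value v
           JOIN custom_field f ON f.id = v.field_id
           WHERE v.template_id = ? AND f.name = ? AND v.lang = ?""",
        (template_id, field_name, lang),
    ).fetchone()
    return row[0] if row else None
```

Detaching a custom field from a single template is then just a delete of its rows in the mapping table, without touching the field definition.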

@aronmolnar
Contributor

This would also be a solution, but would be more complex in implementation.

It might also be more difficult to keep the interface clean for users who don't need that feature.

@aronmolnar
Contributor

We discussed this internally once again. I may have misunderstood your suggestion.
One more option that we discussed...

GUI

A finding template can have multiple tabs. A new tab can be created for a new language. If a new language is added to an existing finding template, all field values are synced to the new tab by default. All fields are then read-only (greyed out) and have a "Translate" button. If this button is clicked, the field is enabled and the user can replace its content.

Users can dynamically choose which fields to sync, even per language in one template.

Database

Fields of finding templates are internally stored in a JSON structure. Finding templates will have a "parent" template. If a parent template exists, the template is a translation.

Translations will have no JSON fields by default. If a field is not present in the JSON structure, this means that its value will be taken from the parent template. As soon as the user clicks "Translate", the field name is added to the JSON, which means that the field's value is controlled by the translated finding template (this way, fields can also be empty).
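
The fallback rule described here can be sketched in a few lines. The dictionary layout and the resolve_fields and translate helpers are assumptions made for illustration, not the actual SysReptor code.

```python
def resolve_fields(template):
    """Return the effective fields: own values win, the rest come from the parent."""
    fields = template["fields"]
    parent = template.get("parent")
    if parent is None:
        return dict(fields)
    resolved = resolve_fields(parent)
    resolved.update(fields)  # a key present in the translation overrides the parent
    return resolved


def translate(template, name, value=""):
    """Simulate clicking "Translate": the field is now owned by the translation."""
    template["fields"][name] = value


main = {
    "language": "en",
    "parent": None,
    "fields": {"title": "SQL injection", "cvss": "9.8", "references": "CWE-89"},
}
fr = {"language": "fr", "parent": main, "fields": {}}

before = resolve_fields(fr)            # everything inherited from the parent
translate(fr, "title", "Injection SQL")
translate(fr, "references")            # overridden and intentionally left empty
after = resolve_fields(fr)
```

Because presence of the key (not its value) decides who controls a field, a translated field can be empty without falling back to the parent.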

This could also mean that "parent" templates can be of any language. We might want to discuss if this is desirable.

@aronmolnar aronmolnar added the enhancement New feature or request label May 10, 2023
@aronmolnar
Contributor

In implementation and will be released in July.

@noraj
Author

noraj commented Jul 7, 2023

In implementation and will be released in July.

This is super good news 🥳 I can't wait to test this and make a video about it.

@MWedl
Contributor

MWedl commented Jul 31, 2023

Multilingual templates are implemented in https://github.com/Syslifters/sysreptor/releases/tag/0.110

We implemented them similarly to what is described in #8 (comment): Templates have a main language and multiple translations. Every translation inherits its fields from the main template and can override language-specific fields.

See the template docs for a more detailed description.

@MWedl MWedl closed this as completed Jul 31, 2023
@aronmolnar
Contributor

templates_multilanguage
