
Trials of pre-trained BERT models for the medical domain in Japanese

These models are designed to be adapted to the Japanese medical domain.
The medical corpora were scraped, for academic use, from Today's diagnosis and treatment: premium, a collection of 15 digital references for clinicians in Japanese published by IGAKU-SHOIN Ltd.
The general corpora were extracted from a Wikipedia dump file (jawiki-20190901) from https://dumps.wikimedia.org/jawiki/.

Our demonstration models

Requirements

For just using the models:

Usage

Please see the code examples in tokenization_example.ipynb, or try example_google_colab.ipynb on Google Colab. A minimal loading sketch follows.
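
For orientation, here is a minimal sketch of what loading and tokenizing might look like with the Hugging Face transformers library. The local directory ./medbertjp is a hypothetical path standing in for a downloaded model; the notebooks above remain the authoritative reference, and the exact tokenizer class and required MeCab dictionaries depend on the model variant.

```python
# A minimal usage sketch, not the repository's official API.
# Assumes `pip install torch transformers fugashi ipadic` and that a
# pretrained model has been downloaded and unpacked to ./medbertjp
# (a hypothetical local path; see the notebooks for the actual setup).
import torch
from transformers import BertJapaneseTokenizer, BertModel

tokenizer = BertJapaneseTokenizer.from_pretrained("./medbertjp")
model = BertModel.from_pretrained("./medbertjp")

# Tokenize a Japanese medical sentence:
# "The patient complained of headache and nausea."
text = "患者は頭痛と悪心を訴えた。"
inputs = tokenizer(text, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist()))

# Extract contextual embeddings for the sentence.
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```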

Funding

This work was supported by the Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), "Innovative AI Hospital System" (funding agency: National Institute of Biomedical Innovation, Health and Nutrition (NIBIOHN)).

Licenses

The pretrained models are distributed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
They are freely available for academic purposes and individual research, but commercial use is prohibited.

The code in this repository is licensed under the Apache License, Version 2.0.