Update benchmark.md (#1312)
Typo fixed
muhammetsnts committed Jul 5, 2024
1 parent 730580f commit 5b282d3
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions docs/en/benchmark.md
@@ -697,14 +697,14 @@ To ensure a fair comparison of these tools, we enlisted the assistance of human
- [sbiobertresolve_rxnorm_augmented](https://nlp.johnsnowlabs.com/2024/01/17/sbiobertresolve_rxnorm_augmented_en.html): Trained with `sbiobert_base_cased_mli` embeddings.
- [biolordresolve_rxnorm_augmented](https://nlp.johnsnowlabs.com/2024/05/06/biolordresolve_rxnorm_augmented_en.html): Trained with `mpnet_embeddings_biolord_2023_c` embeddings.

-- **GPT-4:** *GPT-4 Turbo* and *GPT-4o* models.
+- **GPT-4:** *GPT-4 (Turbo)* and *GPT-4o* models.

- **Amazon:** *Amazon Comprehend Medical* service

### Evaluation Notes

- Healthcare NLP returns up to 25 closest results, and Amazon Medical Comprehend returns up to five results, both sorted starting from the closest one. In contrast, the GPT-4 returns only one result, *so its scores are reflected similarly in both charts*.
-- Since the performance of GPT-4 Turbo and GPT-4o is almost identical according to the [official announcement](https://community.openai.com/t/announcing-gpt-4o-in-the-api/744700?page=3), and we used both versions for the accuracy calculation. Additionally, the GPT-4 returns **only one result**, which means you will see the same results in both evaluation approaches.
+- Since the performance of GPT-4 and GPT-4o is almost identical according to the [official announcement](https://community.openai.com/t/announcing-gpt-4o-in-the-api/744700?page=3), and we used both versions for the accuracy calculation. Additionally, the GPT-4 returns **only one result**, which means you will see the same results in both evaluation approaches.
- Two approaches were adopted for evaluating these tools, given that the model outputs may not precisely match the annotations:
- **Top-3:** Compare the annotations to see if they appear in the first three results.
- **Top-5:** Compare the annotations to see if they appear in the first five results.
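
The Top-3 and Top-5 comparisons described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual evaluation script: the function name `top_k_accuracy` and the placeholder codes are assumptions made up for the example. It also shows why a tool that returns a single candidate gets the same score under both approaches.

```python
def top_k_accuracy(gold_codes, ranked_predictions, k):
    """Fraction of mentions whose annotated code appears in the first k candidates."""
    hits = sum(
        1
        for gold, ranked in zip(gold_codes, ranked_predictions)
        if gold in ranked[:k]
    )
    return hits / len(gold_codes)


# Placeholder labels, not real RxNorm codes (hypothetical data).
gold = ["A", "B", "C", "D"]

# A resolver that returns several ranked candidates per mention.
multi_candidate = [
    ["A", "X", "Y", "Z", "W"],  # annotation at rank 1
    ["X", "Y", "Z", "B", "W"],  # annotation at rank 4
    ["X", "Y", "Z", "W", "V"],  # annotation missing
    ["X", "D", "Y", "Z", "W"],  # annotation at rank 2
]

# A tool that returns only one candidate per mention.
single_candidate = [["A"], ["X"], ["C"], ["X"]]

top3_multi = top_k_accuracy(gold, multi_candidate, 3)
top5_multi = top_k_accuracy(gold, multi_candidate, 5)
top3_single = top_k_accuracy(gold, single_candidate, 3)
top5_single = top_k_accuracy(gold, single_candidate, 5)
```

With one candidate per mention, `ranked[:3]` and `ranked[:5]` are the same list, so the two scores cannot differ.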
@@ -723,7 +723,7 @@ To ensure a fair comparison of these tools, we enlisted the assistance of human

Since we don't have such a small dataset in real world, we calculated the price of these tools according to 1M clinical notes. 

-- *Open AI Pricing:* We created a prompt to achieve better results, which costs $3.476 on GPT-4 and $1.738 GPT-4o model for the 79 documents. This means that for processing **1 million notes, the estimated cost would be $44,000 for the GPT-4 Turbo model** and **$22,000 for the GPT-4o model**.
+- *Open AI Pricing:* We created a prompt to achieve better results, which costs $3.476 on GPT-4 and $1.738 GPT-4o model for the 79 documents. This means that for processing **1 million notes, the estimated cost would be $44,000 for the GPT-4** and **$22,000 for the GPT-4o**.

- *Amazon Comprehend Medical Pricing:* According to the price calculator, obtaining RxNorm predictions for **1M documents, with an average of 9,700 characters per document, costs $24,250**.
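
The OpenAI estimates above follow from a linear extrapolation of the 79-document sample cost to 1M notes. A short sketch of that arithmetic, assuming cost scales linearly with document count (the helper name `extrapolate_cost` is made up for this example):

```python
def extrapolate_cost(sample_cost_usd, sample_docs, target_docs=1_000_000):
    """Scale a measured sample cost linearly to a larger document count."""
    return sample_cost_usd / sample_docs * target_docs


# Sample costs for the 79 documents, as quoted in the text.
gpt4_cost = extrapolate_cost(3.476, 79)   # about $44,000
gpt4o_cost = extrapolate_cost(1.738, 79)  # about $22,000

# Implied per-document cost of the quoted Amazon figure for 1M documents.
amazon_per_doc = 24_250 / 1_000_000       # about $0.024 per document
```

The per-document figures (roughly $0.044, $0.022, and $0.024) make the three quotes directly comparable without the 1M multiplier.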

@@ -739,7 +739,7 @@ Based on the evaluation results:
If you want to process **1M documents** and extract RxNorm codes for medication entities (*excluding the NER stage*), the total cost:
- With Healthcare NLP is about **$4,500, including the infrastructure costs**.
- **$24,250** with Amazon Comprehend Medical
-- **$44,000** with the GPT-4 Turbo and **$22,000** with the GPT-4o.
+- **$44,000** with the GPT-4 and **$22,000** with the GPT-4o.

Therefore, **Healthcare NLP is almost 5 times cheaper than its closest alternative**, not to mention the accuracy differences (**Top 3: Healthcare NLP 82.7% vs Amazon 55.8% vs GPT-4 8.9%**).

@@ -768,7 +768,7 @@ Therefore, **Healthcare NLP is almost 5 times cheaper than its closest alternati
<td>$24,250</td>
</tr>
<tr>
-<td>GPT-4 Turbo</td>
+<td>GPT-4 (Turbo)</td>
<td>8.9%</td>
<td>8.9%</td>
<td>$44,000</td>