<a href="https://colab.research.google.com/github/cahya-wirawan/Bloom/blob/main/bloom_comparison.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Comparison between the new Bloom model (1B parameters) and gpt2-medium-indonesian (340M parameters)

In [1]:
!nvidia-smi

Wed Jul 13 09:30:46 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
%%capture

!pip install transformers

In [3]:
from transformers import pipeline, set_seed

Set up a text-generation pipeline for each of the two models

In [4]:
# device=0 places each pipeline on the first CUDA GPU (the Tesla P100 listed above)
generator_bloom = pipeline('text-generation', model='bigscience/bloom-1b3', device=0)
generator_indonesian_nlp = pipeline('text-generation', model='indonesian-nlp/gpt2-medium-indonesian', device=0)

Set the text-generation options shared by both models

In [5]:
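kwargs = {
    "max_length": 200,          # total length in tokens, prompt included
    "do_sample": True,          # sample from the distribution instead of greedy decoding
    "top_k": 30,                # consider only the 30 most likely next tokens
    "top_p": 0.9,               # nucleus sampling: keep the top 90% of probability mass
    "temperature": 0.3,         # values below 1 sharpen the distribution
    "repetition_penalty": 2.0,  # penalize tokens that were already generated
}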
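To make the interaction of `temperature`, `top_k` and `top_p` more concrete, here is a minimal NumPy sketch that applies the three steps to a toy set of logits. The function name `filter_probs` and the toy logits are assumptions made for illustration; this is a simplified picture of the idea, not the code that `transformers` runs internally, and it leaves out `repetition_penalty`.

```python
import numpy as np

def filter_probs(logits, temperature=0.3, top_k=30, top_p=0.9):
    """Toy illustration: distribution left after temperature, top-k and top-p."""
    # Temperature scaling: dividing by a value below 1 sharpens the distribution.
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    # Softmax, subtracting the maximum for numerical stability.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()

    # Top-k: keep only the k most likely tokens.
    order = np.argsort(probs)[::-1]
    keep = order[:top_k]

    # Top-p: among those, keep the smallest prefix whose cumulative
    # probability (renormalized over the top-k set) reaches top_p.
    kept_probs = probs[keep] / probs[keep].sum()
    cumulative = np.cumsum(kept_probs)
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    keep = keep[:cutoff]

    # Zero out everything else and renormalize.
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

# Hypothetical logits for a four-token vocabulary.
toy_logits = [2.0, 1.0, 0.0, -1.0]
print(filter_probs(toy_logits))
```

With the notebook's settings (`temperature=0.3`, `top_p=0.9`), the most likely token in this toy example already carries more than 90% of the probability mass, so it is the only one that survives the filter. That suggests these options keep sampling fairly close to greedy decoding whenever the model is confident.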

## Indonesian prompt

In [6]:
text = "Berikut ini beberapa penyebab kebanjiran di Jakarta:"
seed = 10

### Bloom 1B model

In [7]:
%%time

set_seed(seed)
result_bloom = generator_bloom(text, **kwargs)
print(f"{result_bloom[0]['generated_text']}\n")

Berikut ini beberapa penyebab kebanjiran di Jakarta:
1. Kebocoran pipa air yang terjadi pada saluran pembuangan, baik itu dari rumah atau apartemen.
2. Air mengalir ke bawah karena adanya atap bocor dan lantai keramik pecah sehingga menyebabkan banjir besar saat hujan deras turun tiba – datang.
3 / 4 orang tidak menggunakan masker ketika berada diluar ruangan, apalagi jika sedang melakukan aktifitas outdoor seperti bermain sepeda motor ataupun naik kendaraan umum ( bus ).
4/5 Orang kurang menjaga kebersihan lingkungan sekitar mereka dengan membuang sampah sembarangan terutama disekitar aliran sungai maupun selokan kecil lainnya

CPU times: user 6.52 s, sys: 49.4 ms, total: 6.57 s
Wall time: 3.44 s


### Indonesian GPT2 340M model

In [8]:
%%time

set_seed(seed)
result_indonesian_nlp = generator_indonesian_nlp(text, pad_token_id=50256, **kwargs)
print(f"{result_indonesian_nlp[0]['generated_text']}\n")

Berikut ini beberapa penyebab kebanjiran di Jakarta:
1. Banjir karena sampah, bukan air hujan yang turun dari langit (hujan buatan) atau banjir kiriman seperti saat musim kemarau lalu?
2. Air sungai meluap ke pemukiman warga akibat pembangunan jalan tol layang dan proyek-proyek lainnya sehingga menyebabkan aliran permukaan tidak lancar; serta tingginya curah hujam pada masa puncak penghujan tahun 2015/2016 kemarin dengan intensitas tinggi selama 3 hari berturut turut yaitu tanggal 1 – 2 Februari 2016 mengakibatkan terjadinya genangan setinggi 30 cm sampai 50cm bahkan ada juga merendam rumah penduduk hingga ketinggian lutut orang dewasa! Hal tersebut diperparah lagi oleh buruknya drainase kota jakarta terutama saluran pembuangan limbah domestik maupun industri besar menjadi salah satu faktor utama pemicu timbulnya bencana alam berupa luapan lumpur panas ataupun robnya tanggul Kali Ciliwung dikarenakan tersumbat nya pori tanah / sedimentasi sebagai dampak adanya kegiatan pengerukan mater

## English prompt

In [9]:
text = "The following are some of the causes of flooding in Jakarta:"
seed = 10

### Bloom 1B model

In [10]:
%%time

set_seed(seed)
result_bloom = generator_bloom(text, **kwargs)
print(f"{result_bloom[0]['generated_text']}\n")

The following are some of the causes of flooding in Jakarta:
1. The lack or poor maintenance and repair works on water supply system, especially for those areas that have been affected by floods.
2. Poor drainage systems which cause runoff to flow into rivers (especially during rainy season).
3.
Poor management practices such as overgrazing land due mainly from illegal logging activities. (Source: BPS DKI)
In order not only reduce flood damage but also improve people's livelihoods through sustainable development strategies based upon local wisdom; it is important to: 1) Develop a comprehensive strategy with integrated approach involving all stakeholders including government agencies involved directly related at each level 2 ) Implement an effective disaster risk reduction program 3 )
Develop appropriate mitigation measures 4 ).
Implement adaptation programs 5 ). Strengthen community resilience 6). Increase awareness about climate change 7). Improve public participation 8).
Improving so

### Indonesian GPT2 340M model

In [11]:
%%time

set_seed(seed)
result_indonesian_nlp = generator_indonesian_nlp(text, pad_token_id=50256, **kwargs)
print(f"{result_indonesian_nlp[0]['generated_text']}\n")

The following are some of the causes of flooding in Jakarta:
1. The land is limited by a large numbers, and it has been too long for another people to use their space on earth as well; 2) Lacked infrastructure (they have no road), but they can not be able at all until now that there was only one way sourced from highway or otherwise transportation system which doubled upon everyone who need themselves within time during peak hours—and this may become more dangeroous if you don't know what your friend's name should put into account! 3). This problem will also affect many others like food stallings outside street area while we're going down township overnight when our familie arrived earlier after nighttime warming day 4)."(http://www.ind

CPU times: user 2.99 s, sys: 9.59 ms, total: 3 s
Wall time: 2.99 s
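The four generation cells above repeat the same pattern: reseed, generate, print, time. A small helper could run any prompt through both models in one call. The sketch below is an assumption about how that might look, not part of the original notebook; `compare_generators` and its parameters are hypothetical names, and the generators are passed in as plain callables so the helper itself does not depend on `transformers`.

```python
import time

def compare_generators(generators, prompt, seed_fn=None, seed=10, **gen_kwargs):
    """Run `prompt` through each generator, reseeding before every call.

    `generators` maps a label to a callable with the pipeline call signature.
    `seed_fn` is expected to be something like transformers.set_seed.
    """
    results = {}
    for name, generate in generators.items():
        if seed_fn is not None:
            seed_fn(seed)  # same seed for every model, as in the cells above
        start = time.perf_counter()
        output = generate(prompt, **gen_kwargs)
        elapsed = time.perf_counter() - start
        results[name] = {
            "text": output[0]["generated_text"],
            "seconds": elapsed,
        }
    return results

# Possible use inside this notebook (not executed here):
# from functools import partial
# results = compare_generators(
#     {
#         "bloom": generator_bloom,
#         "gpt2-indonesian": partial(generator_indonesian_nlp, pad_token_id=50256),
#     },
#     text, seed_fn=set_seed, seed=seed, **kwargs,
# )
```

Wrapping the GPT-2 pipeline in `functools.partial` keeps the model-specific `pad_token_id=50256` argument out of the shared options. The recorded wall-clock time is only a rough indicator, since it measures a single run per model.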
