# LLMs

|No.|Model|GPU memory|
|---|---|---|
|1|[bloopai/mAInframer-7b](#bloopai/mAInframer-7b)|22GB|
|2|[ibm-granite/granite-3b-code-instruct-128k](#ibm-granite/granite-3b-code-instruct-128k)|14GB|
|3|[bigcode/starcoder2-3b](#bigcode/starcoder2-3b)|13GB|
|4|[meta-llama/Llama-3.2-3B-Instruct](#meta-llama/Llama-3.2-3B-Instruct)|11GB|

**Note**: to release GPU memory held by an earlier run, kill its process with `sudo kill -9 <pid>` (the PID is listed by `nvidia-smi`).

## meta-llama/Llama-3.2-3B-Instruct

In [1]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
# Check if a GPU is available
if torch.cuda.is_available():
    # Get the current device index (default is 0 if no other device is specified)
    current_device = torch.cuda.current_device()
    
    # Get the name of the GPU at this device index
    gpu_name = torch.cuda.get_device_name(current_device)
    print(f"Current GPU: {gpu_name}")
else:
    print("No GPU available.")

Current GPU: Tesla P40


In [7]:
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-3B-Instruct"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float32,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])


Some parameters are on the meta device because they were offloaded to the cpu.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


{'role': 'assistant', 'content': "Yer lookin' fer a swashbucklin' chatbot, eh? Alright then, matey, I be Captain Codswallop, the scurviest pirate bot to ever sail the Seven Seas... er, I mean, the internet! Me and me trusty parrot sidekick, Polly, be here to share tales o' adventure, answer yer questions, and maybe even teach ye a thing or two about the high seas... or at least, about pirate-y things! So hoist the colors, me hearty, and let's set sail fer a treasure-filled conversation!"}
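
The printed value is the last entry of the conversation returned by the pipeline: `outputs[0]["generated_text"]` is a list of `{"role", "content"}` dictionaries, and the assistant's reply is appended at the end. The helper below is a hypothetical sketch (the name `last_reply` and the sample data are made up for illustration) that pulls out just the reply text.

```python
from typing import Any, Dict, List


def last_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return the text of the final message in the first generated conversation."""
    conversation = outputs[0]["generated_text"]  # list of {"role", "content"} dicts
    return conversation[-1]["content"]


if __name__ == "__main__":
    # Made-up data with the same shape as the pipeline result above.
    fake_outputs = [{
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "Arr, I be a pirate bot!"},
        ]
    }]
    print(last_reply(fake_outputs))
```

The later cells in this section do the same thing inline with `outputs[0]["generated_text"][-1]['content']`.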


**Code generation**

In [9]:
%%time

prompt = '''Write a code to find the maximum value in a list of numbers in COBOL'''

messages = [
    {"role": "system", "content": "You mainframe programming expert"},
    {"role": "user", "content": prompt},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1]['content'])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Here's an example of how you can find the maximum value in a list of numbers in COBOL:

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. MAXIMUM-VALUE.

DATA DIVISION.
WORKING-STORAGE SECTION.
01  MAX-VALUE         PIC 9(10).
01  NUMBERS           OCCURS 10 TIMES.
02  NUM-VALUE          PIC 9(10).

PROCEDURE DIVISION.
MAIN-PROGRAM.
    MOVE 0 TO MAX-VALUE
    PERFORM 10 TIMES
        MOVE 0 TO NUM-VALUE
        ACCEPT NUM-VALUE
        IF MAX-VALUE < NUM-VALUE
            MOVE NUM-VALUE TO MAX-VALUE
        END-IF
    END-PERFORM
    DISPLAY "Maximum value: " MAX-VALUE
    STOP RUN.
```

However, the above code is not very efficient as it has to iterate over the entire list of numbers to find the maximum value.

A more efficient way to do it would be to use the `MAX` function in COBOL, if your COBOL compiler supports it:

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. MAXIMUM-VALUE.

DATA DIVISION.
WORKING-STORAGE SECTION.

CPU times: user 3min 23s, sys: 5.77 s, total: 3min 29s
Wall time: 

**Code understanding**

In [10]:
%%time

prompt = '''Explain the code:
            IDENTIFICATION DIVISION.
            PROGRAM-ID. EXAMPLE.
            DATA DIVISION.
            WORKING-STORAGE SECTION.
            77  NUM         PICTURE 99.
            77  QUOTIENT    PICTURE 99.
            77  REMAIN      PICTURE 9.
            PROCEDURE DIVISION.
            ACCEPT NUM.
            DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
            IF REMAIN = 0
                   DISPLAY NUM ' IS EVEN'
            ELSE
                   DISPLAY  NUM ' IS ODD'
            END-IF.
            STOP RUN.'''

messages = [
    {"role": "system", "content": "You mainframe programming expert"},
    {"role": "user", "content": prompt},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1]['content'])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


**Mainframe COBOL Program: Dividing a Number and Checking if it's Even or Odd**

This is a simple COBOL program that takes an integer input from the user, divides it by 2, and checks if the remainder is 0 (i.e., if the number is even or odd).

### Breakdown of the Code:

#### IDENTIFICATION DIVISION:
This section contains metadata about the program, such as its name and purpose.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
```

#### DATA DIVISION:
This section defines the data structures used in the program.

```cobol
DATA DIVISION.
WORKING-STORAGE SECTION.
```

The `WORKING-STORAGE SECTION` is used to declare variables that are used only within the program. In this case, we have three variables:

* `NUM`: an integer variable to store the input number
* `QUOTIENT`: an integer variable to store the result of the division
* `REMAIN`: an integer variable to store the remainder of the division

```cobol
77  NUM         PICTURE 99.
77  QUOTIENT    PICTURE 99.
77  REMAIN      PICT

**Code migration**

In [11]:
%%time

prompt = '''
Convert the given COBOL code into Java
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
77  NUM         PICTURE 99.
77  QUOTIENT    PICTURE 99.
77  REMAIN      PICTURE 9.
PROCEDURE DIVISION.
ACCEPT NUM.
DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
IF REMAIN = 0
       DISPLAY NUM ' IS EVEN'
ELSE
       DISPLAY  NUM ' IS ODD'
END-IF.
STOP RUN.
'''

messages = [
    {"role": "system", "content": "You mainframe programming expert"},
    {"role": "user", "content": prompt},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1]['content'])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Here's the equivalent Java code for the provided COBOL program:

```java
import java.util.Scanner;

public class Example {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);

        System.out.print("Enter a number: ");
        int num = scanner.nextInt();

        int quotient = num / 2;
        int remainder = num % 2;

        if (remainder == 0) {
            System.out.println(num + " is even");
        } else {
            System.out.println(num + " is odd");
        }

        scanner.close();
    }
}
```

Here's a breakdown of the conversion:

- The `IDENTIFICATION DIVISION` is not needed in Java, as it's used to define the program's identification, which is not necessary in this simple program.
- The `DATA DIVISION` and `WORKING-STORAGE SECTION` are also not needed in Java, as they are used to define data and working storage variables, respectively. Instead, we use instance variables or method parameters to store and manipulate data

**Support Japanese**

In [14]:
%%time

prompt = "リスト内の数値の最大値を求めるコードをCOBOLで書いてください。"

messages = [
    {"role": "system", "content": "You mainframe programming expert"},
    {"role": "user", "content": prompt},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)

print(outputs[0]["generated_text"][-1]['content'])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


以下はCOBOLでリスト内の数値の最大値を求めるコードです。

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. MAX-VALUE-SEARCH.

DATA DIVISION.
WORKING-STORAGE SECTION.
01  MAX-VALUE         PIC 9(10).
01  NUMBERS           OCCURS 10 TIMES.
02  NUMBER            PIC 9(10).

PROCEDURE DIVISION.
MAIN-PROGRAM.
    MOVE 0 TO MAX-VALUE
    PERFORM 10 TIMES
        DISPLAY "数字を入力してください"
        ACCEPT NUMBER
        ADD NUMBER TO MAX-VALUE
    END-PERFORM
    DISPLAY "最大値は"
    DISPLAY MAX-VALUE
    STOP RUN.
```

このコードでは、10 個の数字をリスト内に入力し、それらの最大値を計算します。 

このコードは、リスト内の最大値を計算するには、リスト内の数字の最大値がすでに計算されている場合にのみ、数値を計算する必要があるため、リスト内の数字を計算するための外部コンテキストを提供しません。 

このコードは、COBOL 85の仕様に基づいています。COBOL
CPU times: user 3min 24s, sys: 5.79 s, total: 3min 29s
Wall time: 3min 29s
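
The `%%time` figures can be turned into a rough decoding throughput. The sketch below assumes the run used its full `max_new_tokens` budget (the truncated answers suggest so, but this is not verified) and it ignores the time spent processing the prompt.

```python
def tokens_per_second(new_tokens: int, wall_seconds: float) -> float:
    """Approximate decoding throughput from a token budget and wall time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return new_tokens / wall_seconds


if __name__ == "__main__":
    # 256 new tokens in 3min 29s (209 s), as reported for the Japanese prompt.
    print(f"{tokens_per_second(256, 3 * 60 + 29):.2f} tokens/s")  # 1.22 tokens/s
```

Roughly 1.2 tokens per second is slow for a 3B model on a GPU; the warning that some parameters were offloaded to the CPU is a likely contributor.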


## bigcode/starcoder2-3b

In [1]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
# Check if a GPU is available
if torch.cuda.is_available():
    # Get the current device index (default is 0 if no other device is specified)
    current_device = torch.cuda.current_device()
    
    # Get the name of the GPU at this device index
    gpu_name = torch.cuda.get_device_name(current_device)
    print(f"Current GPU: {gpu_name}")
else:
    print("No GPU available.")

Current GPU: Tesla P40


In [2]:
# pip install git+https://github.com/huggingface/transformers.git # TODO: merge PR to main
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-3b"
device = "cuda" # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# for multiple GPUs install accelerate and do `model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")`
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


def print_hello_world():
    print("Hello World")

def print_hello_
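
The warnings above appear because `tokenizer.encode` returns only the input ids, so `generate` receives neither an attention mask nor a pad token id. One way to avoid them, sketched below, is to call the tokenizer directly (which also returns `attention_mask`) and pass both values explicitly. `build_generate_kwargs` is a hypothetical helper written for this note, not part of `transformers`.

```python
from typing import Any, Mapping


def build_generate_kwargs(
    encoded: Mapping[str, Any],
    pad_token_id: int,
    max_new_tokens: int = 250,
) -> dict:
    """Collect keyword arguments for `model.generate` from a tokenizer result.

    `encoded` is expected to be what `tokenizer(prompt, return_tensors="pt")`
    returns, i.e. a mapping holding `input_ids` and `attention_mask`.
    """
    if "attention_mask" not in encoded:
        raise KeyError("call the tokenizer directly so it returns attention_mask")
    return {
        "input_ids": encoded["input_ids"],
        "attention_mask": encoded["attention_mask"],
        "pad_token_id": pad_token_id,
        "max_new_tokens": max_new_tokens,
    }


# Hypothetical usage with the objects from the cell above:
#   encoded = tokenizer(prompt, return_tensors="pt").to(device)
#   outputs = model.generate(**build_generate_kwargs(encoded, tokenizer.eos_token_id))
```

Using the end-of-sequence id as the pad id mirrors what the library already does by default, as the `Setting pad_token_id to eos_token_id` message shows.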


**Code generation**

In [3]:
%%time

prompt = '''Write a code to find the maximum value in a list of numbers in COBOL'''

inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


Write a code to find the maximum value in a list of numbers in COBOL.

```
01  LIST-OF-NUMBERS.
   05  NUMBERS-LIST.
       05  NUMBER-1 PIC 9(4).
       05  NUMBER-2 PIC 9(4).
       05  NUMBER-3 PIC 9(4).
       05  NUMBER-4 PIC 9(4).
       05  NUMBER-5 PIC 9(4).
       05  NUMBER-6 PIC 9(4).
       05  NUMBER-7 PIC 9(4).
       05  NUMBER-8 PIC 9(4).
       05  NUMBER-9 PIC 9(4).
       05  NUMBER-10 PIC 9(4).

```

I have tried the following code but it is not working.

```
01  MAX-NUMBER.
   05  MAX-NUMBER-VALUE PIC 9(4).

```

```
MOVE 0 TO MAX-NUMBER-VALUE.
MOVE NUMBER-1 TO MAX-NUMBER-VALUE.
IF MAX-NUMBER-VALUE < NUMBER-2 THEN
CPU times: user 12.1 s, sys: 0 ns, total: 12.1 s
Wall time: 12.1 s


**Code understanding**

In [4]:
%%time
prompt = '''Explain the code:
            IDENTIFICATION DIVISION.
            PROGRAM-ID. EXAMPLE.
            DATA DIVISION.
            WORKING-STORAGE SECTION.
            77  NUM         PICTURE 99.
            77  QUOTIENT    PICTURE 99.
            77  REMAIN      PICTURE 9.
            PROCEDURE DIVISION.
            ACCEPT NUM.
            DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
            IF REMAIN = 0
                   DISPLAY NUM ' IS EVEN'
            ELSE
                   DISPLAY  NUM ' IS ODD'
            END-IF.
            STOP RUN.'''

inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


Explain the code:
            IDENTIFICATION DIVISION.
            PROGRAM-ID. EXAMPLE.
            DATA DIVISION.
            WORKING-STORAGE SECTION.
            77  NUM         PICTURE 99.
            77  QUOTIENT    PICTURE 99.
            77  REMAIN      PICTURE 9.
            PROCEDURE DIVISION.
            ACCEPT NUM.
            DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
            IF REMAIN = 0
                   DISPLAY NUM'IS EVEN'
            ELSE
                   DISPLAY  NUM'IS ODD'
            END-IF.
            STOP RUN.
        </pre>
        <p>
            <a href="https://github.com/michael-whelan/COBOL-Examples/blob/master/Examples/Example-001.cbl">Download Example-001.cbl</a>
        </p>
    </div>
</div>

<div class="row">
    <div class="col-md-12">
        <h2>Example 002</h2>
        <pre>
            IDENTIFICATION DIVISION.
            PROGRAM-ID. EXAMPLE.
            DATA DIVISION.
            WORKING-STORAGE SECTION.
            77  NUM        

**Code migration**

In [5]:
%%time
prompt = '''
Convert the given COBOL code into Java
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
77  NUM         PICTURE 99.
77  QUOTIENT    PICTURE 99.
77  REMAIN      PICTURE 9.
PROCEDURE DIVISION.
ACCEPT NUM.
DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
IF REMAIN = 0
       DISPLAY NUM ' IS EVEN'
ELSE
       DISPLAY  NUM ' IS ODD'
END-IF.
STOP RUN.
'''

inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.



Convert the given COBOL code into Java
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
77  NUM         PICTURE 99.
77  QUOTIENT    PICTURE 99.
77  REMAIN      PICTURE 9.
PROCEDURE DIVISION.
ACCEPT NUM.
DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
IF REMAIN = 0
       DISPLAY NUM'IS EVEN'
ELSE
       DISPLAY  NUM'IS ODD'
END-IF.
STOP RUN.
<file_sep>/src/main/java/com/example/cobol/examples/chapter1/Example1_1.java
package com.example.cobol.examples.chapter1;

import com.example.cobol.examples.Example;

public class Example1_1 implements Example {

    @Override
    public String getTitle() {
        return "Example 1.1";
    }

    @Override
    public String getCode() {
        return "IDENTIFICATION DIVISION.\n" +
                "PROGRAM-ID. EXAMPLE.\n" +
                "DATA DIVISION.\n" +
                "WORKING-STORAGE SECTION.\n" +
                "01  NUM        PICTURE 99.\n" +
                "PROCEDURE DIVISION.\n" +
             

**Support Japanese**

In [6]:
%%time
prompt = "リスト内の数値の最大値を求めるコードをCOBOLで書いてください。"

inputs = tokenizer.encode(prompt, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


リスト内の数値の最大値を求めるコードをCOBOLで書いてください。

```
100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
CPU times: user 12.1 s, sys: 0 ns, total: 12.1 s
Wall time: 12.1 s


## ibm-granite/granite-3b-code-instruct-128k

In [1]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
# Check if a GPU is available
if torch.cuda.is_available():
    # Get the current device index (default is 0 if no other device is specified)
    current_device = torch.cuda.current_device()
    
    # Get the name of the GPU at this device index
    gpu_name = torch.cuda.get_device_name(current_device)
    print(f"Current GPU: {gpu_name}")
else:
    print("No GPU available.")

Current GPU: Tesla P40


In [2]:
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # or "cpu"
model_path = "ibm-granite/granite-3b-code-instruct-128k"
tokenizer = AutoTokenizer.from_pretrained(model_path)

# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

# change input text as desired
chat = [
    { "role": "user", "content": "Write a code to find the maximum value in a list of numbers." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")

# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)

# decode output tokens into text
output = tokenizer.batch_decode(output)

# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)

Question:
Write a code to find the maximum value in a list of numbers.

Answer:
Here's how you can implement this:

```python
def find_max(numbers):
    max_value = numbers[0]
    for num in numbers[1:]:
        if num > max_value:
            max_value = num
    return max_value
```
<|endoftext|>


**Code generation**

In [5]:
%%time

# change input text as desired
chat = [
    { "role": "user", "content": "Write a code to find the maximum value in a list of numbers in COBOL" },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")

# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=250)

# decode output tokens into text
output = tokenizer.batch_decode(output)

# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)

Question:
Write a code to find the maximum value in a list of numbers in COBOL

Answer:
```cobol
 identify maximum-value.
 data division.
 working-storage section.
 77 list-items pic 9(3) values (1, 2, 3, 4, 5, 6, 7, 8, 9, 10).
 77 max-value pic 9(3) value 0.
 procedure division.
 perform maximum-value using list-items.
 stop run.
 maximum-value.
  if max-value < list-items(1) then
    max-value = list-items(1).
  else if max-value < list-items(2) then
    max-value = list-items(2).
  else if max-value < list-items(3) then
    max-value = list-items(3).
  else if max-value < list-items(4) then
    max-value = list-items(4).
  else if max-value < list-items(5) then
    max-value = list-items(5).
  else if max-value < list-items(6) then
    max-value = list-items(6
CPU times: user 12.6 s, sys: 830 ms, total: 13.5 s
Wall time: 13.5 s
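
The printed text repeats the prompt (`Question: ...`) because the whole output sequence is decoded, prompt included. A common way to show only the answer is to slice off the prompt tokens before decoding. The helper below is a hypothetical sketch of that slicing step on plain Python lists; the token ids are toy values.

```python
from typing import List, Sequence


def strip_prompt(output_ids: Sequence[Sequence[int]], prompt_len: int) -> List[List[int]]:
    """Drop the first `prompt_len` token ids from every sequence in the batch."""
    return [list(seq[prompt_len:]) for seq in output_ids]


if __name__ == "__main__":
    # Toy ids: a 3-token prompt followed by 2 generated tokens.
    print(strip_prompt([[11, 12, 13, 21, 22]], prompt_len=3))  # [[21, 22]]
```

With the tensors from the cell above, the equivalent would be `tokenizer.batch_decode(output[:, input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)`, applied before `output` is overwritten with the decoded strings. `skip_special_tokens=True` also drops markers such as the `<|endoftext|>` seen in the first example.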


**Code understanding**

In [7]:
%%time

# change input text as desired
chat = [
    { "role": "user", "content": """Explain the code:
                        IDENTIFICATION DIVISION.
                        PROGRAM-ID. EXAMPLE.
                        DATA DIVISION.
                        WORKING-STORAGE SECTION.
                        77  NUM         PICTURE 99.
                        77  QUOTIENT    PICTURE 99.
                        77  REMAIN      PICTURE 9.
                        PROCEDURE DIVISION.
                        ACCEPT NUM.
                        DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
                        IF REMAIN = 0
                               DISPLAY NUM ' IS EVEN'
                        ELSE
                               DISPLAY  NUM ' IS ODD'
                        END-IF.
                        STOP RUN.""" 
    },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")

# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=250)

# decode output tokens into text
output = tokenizer.batch_decode(output)

# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)

Question:
Explain the code:
                        IDENTIFICATION DIVISION.
                        PROGRAM-ID. EXAMPLE.
                        DATA DIVISION.
                        WORKING-STORAGE SECTION.
                        77  NUM         PICTURE 99.
                        77  QUOTIENT    PICTURE 99.
                        77  REMAIN      PICTURE 9.
                        PROCEDURE DIVISION.
                        ACCEPT NUM.
                        DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
                        IF REMAIN = 0
                               DISPLAY NUM'IS EVEN'
                        ELSE
                               DISPLAY  NUM'IS ODD'
                        END-IF.
                        STOP RUN.

Answer:
The code provided is a simple program written in COBOL (Commonly Operating
Blockchain Language) that takes an input number and determines whether it is even or odd.

The program begins with the identification division, which contains t

**Code migration**

In [8]:
%%time

# change input text as desired
chat = [
    { "role": "user", "content": """Convert the given COBOL code into Java
                        IDENTIFICATION DIVISION.
                        PROGRAM-ID. EXAMPLE.
                        DATA DIVISION.
                        WORKING-STORAGE SECTION.
                        77  NUM         PICTURE 99.
                        77  QUOTIENT    PICTURE 99.
                        77  REMAIN      PICTURE 9.
                        PROCEDURE DIVISION.
                        ACCEPT NUM.
                        DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
                        IF REMAIN = 0
                               DISPLAY NUM ' IS EVEN'
                        ELSE
                               DISPLAY  NUM ' IS ODD'
                        END-IF.
                        STOP RUN.""" 
    },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")

# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=250)

# decode output tokens into text
output = tokenizer.batch_decode(output)

# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)

Question:
Convert the given COBOL code into Java
                        IDENTIFICATION DIVISION.
                        PROGRAM-ID. EXAMPLE.
                        DATA DIVISION.
                        WORKING-STORAGE SECTION.
                        77  NUM         PICTURE 99.
                        77  QUOTIENT    PICTURE 99.
                        77  REMAIN      PICTURE 9.
                        PROCEDURE DIVISION.
                        ACCEPT NUM.
                        DIVIDE NUM BY 2 GIVING QUOTIENT REMAINDER REMAIN.
                        IF REMAIN = 0
                               DISPLAY NUM'IS EVEN'
                        ELSE
                               DISPLAY  NUM'IS ODD'
                        END-IF.
                        STOP RUN.

Answer:
Here is the Java code equivalent to the given COBOL code:

```java
import java.util.Scanner;

public class Example {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
  

**Support Japanese**

In [6]:
%%time

# change input text as desired
chat = [
    { "role": "user", "content": "リスト内の数値の最大値を求めるコードをCOBOLで書いてください。" },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")

# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=250)

# decode output tokens into text
output = tokenizer.batch_decode(output)

# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)

Question:
リスト内の数値の最大値を求めるコードをCOBOLで書いてください。

Answer:
はい、COBOLでmax-valueを求めるためのプログラムを以下に記載します。

```cobol
DATA  MAX-VAL  PIC 9(4) INITIAL 0.
DATA  COUNT    PIC 9(4) INITIAL 0.
DATA  LST      PIC 9(4) INITIAL 0.

READ  LST.
ADD 1 TO COUNT.
IF COUNT = 10
    GO TO 100.
ADD LST TO MAX-VAL.
GO TO 200.

100  DISPLAY MAX-VAL.
999  STOP RUN.

200  READ  LST.
ADD 1 TO COUNT.
IF COUNT = 10
    GO TO 100.
ADD LST TO MAX-VAL.
GO TO 200.
```

上記のprogramは、LIST内の10個の数値の最大値を求めることを目的としています。

programは、MAX-VALとCOUNTを用い、LISTの各要素を加算してMAX
CPU times: user 12.6 s, sys: 865 ms, total: 13.5 s
Wall time: 13.5 s


## bloopai/mAInframer-7b

In [1]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
# Check if a GPU is available
if torch.cuda.is_available():
    # Get the current device index (default is 0 if no other device is specified)
    current_device = torch.cuda.current_device()
    
    # Get the name of the GPU at this device index
    gpu_name = torch.cuda.get_device_name(current_device)
    print(f"Current GPU: {gpu_name}")
else:
    print("No GPU available.")

Current GPU: Tesla P40


In [2]:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Set environment variable to manage CUDA memory allocations
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "bloopai/mAInframer-7b",
    device_map="auto",
    torch_dtype="auto",
    low_cpu_mem_usage=True
)

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")

# Free any unused cached memory
torch.cuda.empty_cache()

Some parameters are on the meta device because they were offloaded to the cpu.


**Code generation**

In [4]:
%%time
prompt = '''       IDENTIFICATION DIVISION.
       PROGRAM-ID.  SUM-OF-CUBES.
       ENVIRONMENT DIVISION.
       
       INPUT-OUTPUT SECTION.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       
       01 STEP         PIC S9(10).
       01 CUBE         PIC 9(7).
       01 CUBE-SUM     PIC 9(7) VALUE 0.

       LINKAGE SECTION.

       01 LINKED-ITEMS.
           05 L-MAX-STEP PIC S9(10).
           05 RESULT PIC S9(10).

      * 
      * Given an integer number, return the sum of the of all the integers below it.
      * 
      * Example:
      * 
      * sum_of_cubes(3) == 1**3 + 2**3 == 9
      * sum_of_cubes(5) == 100
      *  

      * Store the result in the RESULT variable and mark the end of your program with END PROGRAM
'''

inputs = tokenizer.encode(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
outputs = model.generate(inputs, max_new_tokens=250, use_cache=True, do_sample=False, repetition_penalty=1.1)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


       IDENTIFICATION DIVISION.
       PROGRAM-ID.  SUM-OF-CUBES.
       ENVIRONMENT DIVISION.
       
       INPUT-OUTPUT SECTION.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       
       01 STEP         PIC S9(10).
       01 CUBE         PIC 9(7).
       01 CUBE-SUM     PIC 9(7) VALUE 0.

       LINKAGE SECTION.

       01 LINKED-ITEMS.
           05 L-MAX-STEP PIC S9(10).
           05 RESULT PIC S9(10).

      * 
      * Given an integer number, return the sum of the of all the integers below it.
      * 
      * Example:
      * 
      * sum_of_cubes(3) == 1**3 + 2**3 == 9
      * sum_of_cubes(5) == 100
      *  

      * Store the result in the RESULT variable and mark the end of your program with END PROGRAM

       PROCEDURE DIVISION USING LINKED-ITEMS.
           MOVE L-MAX-STEP TO STEP.
           PERFORM VARYING STEP FROM 1 BY 1 UNTIL STEP > L-MAX-STEP
               COMPUTE CUBE = STEP ** 3
               ADD CUBE TO CUBE-SUM
           END-PERFORM.
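
The comments in the prompt specify the expected behaviour through two examples. A small Python reference, written here only as a checking aid and not part of the original notebook, makes those expectations executable so that the `RESULT` of a generated COBOL program can be compared against it. The reading "cubes of the positive integers strictly below the input" is inferred from the two examples.

```python
def sum_of_cubes(n: int) -> int:
    """Sum of i**3 for every positive integer i strictly below n."""
    return sum(i**3 for i in range(1, n))


if __name__ == "__main__":
    print(sum_of_cubes(3))  # 9   (1**3 + 2**3)
    print(sum_of_cubes(5))  # 100 (1 + 8 + 27 + 64)
```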
           

**Infill**

In [6]:
%%time
prompt = '''<PRE>       IDENTIFICATION DIVISION.
       PROGRAM-ID.  SUM-OF-CUBES.
       ENVIRONMENT DIVISION.
       
       INPUT-OUTPUT SECTION.

       DATA DIVISION.<SUF>

       LINKAGE SECTION.

       01 LINKED-ITEMS.
           05 L-MAX-STEP PIC S9(10).
           05 RESULT PIC S9(10).

      * 
      * Given an integer number, return the sum of the of all the integers below it.
      * 
      * Example:
      * 
      * sum_of_cubes(3) == 1**3 + 2**3 == 9
      * sum_of_cubes(5) == 100
      *  

      * Store the result in the RESULT variable and mark the end of your program with END PROGRAM'''

inputs = tokenizer.encode(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
outputs = model.generate(inputs, max_new_tokens=250, use_cache=True, do_sample=False, repetition_penalty=1.1)
print(tokenizer.decode(outputs[0]))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


<PRE>       IDENTIFICATION DIVISION.
       PROGRAM-ID.  SUM-OF-CUBES.
       ENVIRONMENT DIVISION.
       
       INPUT-OUTPUT SECTION.

       DATA DIVISION.<SUF>

       LINKAGE SECTION.

       01 LINKED-ITEMS.
           05 L-MAX-STEP PIC S9(10).
           05 RESULT PIC S9(10).

      * 
      * Given an integer number, return the sum of the of all the integers below it.
      * 
      * Example:
      * 
      * sum_of_cubes(3) == 1**3 + 2**3 == 9
      * sum_of_cubes(5) == 100
      *  

      * Store the result in the RESULT variable and mark the end of your program with END PROGRAM

       PROCEDURE DIVISION USING LINKED-ITEMS.
           MOVE L-MAX-STEP TO WS-MAX-STEP.
           PERFORM VARYING WS-COUNTER FROM 1 BY 1 UNTIL WS-COUNTER > WS-MAX-STEP
               COMPUTE WS-SUM = WS-SUM + (WS-COUNTER ** 3)
           END-PERFORM.
           DISPLAY 'THE SUM OF THE CUBES IS ' WS-SUM.
           MOVE WS-SUM TO RESULT.
           GOBACK.
           
       END PROGRAM SUM-OF-CU
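
The infill prompt above places the code before the gap after `<PRE>` and the code after the gap after `<SUF>`. The sketch below assembles a prompt in that layout from two strings. `build_infill_prompt` and its `mid_marker` parameter are hypothetical names introduced for illustration. CodeLlama-style infilling prompts conventionally end with a `<MID>` marker, which the prompt above omits; whether this model expects it is an assumption to verify against the model card.

```python
def build_infill_prompt(prefix: str, suffix: str, mid_marker: str = "") -> str:
    """Assemble a fill-in-the-middle prompt in the layout used above.

    The notebook's prompt is `<PRE>{prefix}<SUF>{suffix}` with no closing
    marker. Pass e.g. mid_marker=" <MID>" to try the CodeLlama-style variant
    (an assumption, not something the notebook confirms).
    """
    return f"<PRE>{prefix}<SUF>{suffix}{mid_marker}"


if __name__ == "__main__":
    prefix = "       DATA DIVISION."
    suffix = "\n\n       LINKAGE SECTION.\n"
    print(build_infill_prompt(prefix, suffix))
```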