### Creation of Multibody Models

As shown previously, the GPT4All interface can run a local LLM; here we use it to create a more complex multibody model. Another popular option is llama.cpp, which also provides a Python [library](https://pypi.org/project/llama-cpp-python/).  
Note: installing it with ```pip install llama-cpp-python``` typically builds llama.cpp from source and therefore requires a C/C++ compiler.

In [1]:
from gpt4all import GPT4All
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download
import torch # only used to check if cuda is available
import time
import os

flagSmall = False

if flagSmall: # approx 2.4GB
    repo_id = 'microsoft/phi-3-mini-4k-instruct-gguf'
    filename =  "Phi-3-mini-4k-instruct-q4.gguf"
else: # approx 8.4 GB
    repo_id = "bartowski/phi-4-GGUF"
    filename = "phi-4-Q4_K_S.gguf"
    
modelPath = hf_hub_download(repo_id=repo_id, filename=filename)
try:
    # configure to support a context length of 16k tokens
    model = GPT4All(modelPath, device='cuda', n_ctx=16384)
    print('running model on GPU')
except Exception:
    # fall back to the CPU if the GPU backend cannot be initialized
    model = GPT4All(modelPath, device='cpu', n_ctx=16384)
    print('running model on CPU')



running model on GPU
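
The GPU-then-CPU fallback above can be written as a small reusable helper. The following is a minimal sketch, not part of the original notebook: `load_with_fallback` and `fake_loader` are hypothetical names, and the fake loader only simulates a machine without a working CUDA backend.

```python
def load_with_fallback(loader, devices=("cuda", "cpu")):
    """Try loader(device) for each device in order; return (model, device)."""
    last_error = None
    for device in devices:
        try:
            return loader(device), device
        except Exception as err:  # remember the error and try the next device
            last_error = err
    raise RuntimeError(f"no device worked: {devices}") from last_error


def fake_loader(device):
    # Hypothetical stand-in for a model constructor: pretend CUDA is unavailable.
    if device == "cuda":
        raise RuntimeError("CUDA backend not available")
    return f"model on {device}"


demo_model, demo_device = load_with_fallback(fake_loader)
print("running model on", demo_device)
```

In the notebook, the loader could be a lambda wrapping the existing call, for example `lambda d: GPT4All(modelPath, device=d, n_ctx=16384)`.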


### Providing Context
 
We want to use Exudyn - our in-house multibody simulation tool. Note that the same strategy is also possible with other tools.  
The context is already prepared in a separate file. 


In [2]:
# Open the context file and read all of its lines into a list
with open("ContextGeneral.txt", "r") as file:
    lines = file.readlines()

# Skip the first 31 lines (header); keep everything from line 32 onwards (index 31)
myData = lines[31:]
myContext = "".join(myData)

# Print the first 1000 characters of the extracted context
print(myContext[:1000] + '...\n')

#In the following, there are examples to create multibody systems in Exudyn and also important notes about the interface.
#NOTE: mbs.Create...(...) calls several functions in the background to create nodes, objects, markers and loads in Exudyn.
#most quantities such as initial or reference positions and velocities are giving as 3D lists [x,y,z] for positions, velocities, ....
#rotations are usually given as rotation matrix (3x3 numpy array); 
#RotationVector2RotationMatrix([rotX, rotY, rotZ]) computes a rotation around the global x,y,z rotation axis
#for working with rigid bodies, note that there is always a local coordinate system in the body, 
#which can be used to define the location of position and orientation of joints, see the examples.
    
#%%++++++++++++++++++++++++++++++++++++++++++++++++++++
#create rigid bodies and mass points with distance constraint and joints
import exudyn as exu
from exudyn.utilities import * #includes itemInterface and rigidBodyUtilities
import exudyn.
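
A hard-coded line index breaks silently when the header of the context file changes. As a minimal sketch (the helper `read_context` and the demo file are hypothetical, not part of the notebook), the header can instead be skipped by searching for the first line of the actual context:

```python
import os
import tempfile


def read_context(path, skip_lines=0, start_marker=None):
    """Return the file content after its header.

    If start_marker is given, the content starts at the first line that
    begins with the marker; otherwise the first skip_lines lines are dropped.
    """
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    if start_marker is not None:
        for i, line in enumerate(lines):
            if line.startswith(start_marker):
                return "".join(lines[i:])
        raise ValueError(f"marker {start_marker!r} not found in {path}")
    return "".join(lines[skip_lines:])


# Demo on a small temporary file with a two-line header.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "ContextDemo.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("header 1\nheader 2\n#In the following\nimport exudyn as exu\n")
    by_index = read_context(demo_path, skip_lines=2)
    by_marker = read_context(demo_path, start_marker="#In the following")

print(by_index == by_marker)
```

Both calls return the same text here; the marker variant keeps working if lines are added to or removed from the header.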

Next we need a description of the desired output.  

In [3]:
prompt = 'Using the previous information and the already existing information on Python code Exudyn, ' + \
        ' create a 3-link system of rigid bodies initially aligned along the X-axis with mass 10kg, length 2m and W=H=0.1 m. ' + \
        ' The rigid bodies are exposed to gravity which acts in Y-direction and the first link is attached to ground with a revolute joint at the left end. '  


In [4]:
strTask = '{}\n\n {}'.format(myContext, prompt)

# print('model path: ', modelPath, 'length: ', len(modelPath))
tokenizer = AutoTokenizer.from_pretrained('microsoft/phi-4')
# Tokenize the prompt
tokens = tokenizer.encode(strTask, add_special_tokens=False)
num_tokens = len(tokens)

## Print the tokens and their count
print("Number of tokens:", num_tokens)
print("Tokens:", tokens)





Number of tokens: 2031
Tokens: [2, 644, 279, 2768, 11, 1070, 527, 10507, 311, 1893, 2814, 581, 1094, 6067, 304, 1398, 664, 1910, 323, 1101, 3062, 8554, 922, 279, 3834, 627, 2, 28892, 25, 296, 1302, 7399, 1131, 48627, 6880, 3892, 5865, 304, 279, 4092, 311, 1893, 7954, 11, 6302, 11, 24915, 323, 21577, 304, 1398, 664, 1910, 627, 2, 3646, 33776, 1778, 439, 2926, 477, 5905, 10093, 323, 75157, 527, 7231, 439, 220, 18, 35, 11725, 510, 87, 7509, 23500, 60, 369, 10093, 11, 75157, 11, 22666, 198, 2, 4744, 811, 527, 6118, 2728, 439, 12984, 6303, 320, 18, 87, 18, 8760, 1358, 1237, 720, 2, 18947, 3866, 17, 18947, 6828, 2625, 4744, 55, 11, 5868, 56, 11, 5868, 57, 2526, 58303, 264, 12984, 2212, 279, 3728, 865, 7509, 23500, 12984, 8183, 198, 81517, 3318, 449, 33956, 13162, 11, 5296, 430, 1070, 374, 2744, 264, 2254, 16580, 1887, 304, 279, 2547, 11, 720, 2, 8370, 649, 387, 1511, 311, 7124, 279, 3813, 315, 2361, 323, 17140, 315, 35358, 11, 1518, 279, 10507, 627, 1084, 94335, 89255, 46253, 14907, 198, 791
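
The token count matters because the prompt and the generated answer must fit into the context window together. A minimal sketch of that budget check, using the numbers from this notebook (2031 prompt tokens, a context of 16384 tokens, and the generation limit of 3000 tokens used in the code generation cell); `fits_context` is a hypothetical helper, and the few extra tokens added by the chat template are ignored:

```python
def fits_context(num_prompt_tokens, max_new_tokens, n_ctx):
    """Return (ok, remaining): whether prompt plus answer fit into n_ctx tokens."""
    remaining = n_ctx - num_prompt_tokens - max_new_tokens
    return remaining >= 0, remaining


ok, remaining = fits_context(num_prompt_tokens=2031, max_new_tokens=3000, n_ctx=16384)
print(ok, remaining)  # → True 11353
```

With the 4k-context small model, the same prompt and generation limit would not fit: `fits_context(2031, 3000, 4096)` reports a deficit of 935 tokens.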

### Prompt Formatting
The following string can also be copied to ChatGPT or Google Gemini. 

In [5]:
# Format chat-style prompts or instruction/output pairs
# <|im_start|>system<|im_sep|>{system_prompt}<|im_end|><|im_start|>user<|im_sep|>{prompt}<|im_end|><|im_start|>assistant<|im_sep|>
system_prompt = ''
FormattedStrTask = f"<|im_start|>system<|im_sep|>{system_prompt}<|im_end|><|im_start|>user<|im_sep|>{strTask}<|im_end|><|im_start|>assistant<|im_sep|>"

print(FormattedStrTask.replace('<|im_sep|>', '<|im_sep|>' + '\n').replace('<|im_end|>', '<|im_end|>\n'))

<|im_start|>system<|im_sep|>
<|im_end|>
<|im_start|>user<|im_sep|>
#In the following, there are examples to create multibody systems in Exudyn and also important notes about the interface.
#NOTE: mbs.Create...(...) calls several functions in the background to create nodes, objects, markers and loads in Exudyn.
#most quantities such as initial or reference positions and velocities are giving as 3D lists [x,y,z] for positions, velocities, ....
#rotations are usually given as rotation matrix (3x3 numpy array); 
#RotationVector2RotationMatrix([rotX, rotY, rotZ]) computes a rotation around the global x,y,z rotation axis
#for working with rigid bodies, note that there is always a local coordinate system in the body, 
#which can be used to define the location of position and orientation of joints, see the examples.
    
#%%++++++++++++++++++++++++++++++++++++++++++++++++++++
#create rigid bodies and mass points with distance constraint and joints
import exudyn as exu
from exudyn.utilities imp
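
The template used above can be wrapped in a small function so that the system and user parts are filled in consistently. This is a minimal sketch; `format_phi4_prompt` is a hypothetical helper name, and the special tokens are exactly those from the formatting cell above:

```python
def format_phi4_prompt(user_prompt, system_prompt=""):
    """Wrap a user prompt in the chat template used in this notebook."""
    return (
        f"<|im_start|>system<|im_sep|>{system_prompt}<|im_end|>"
        f"<|im_start|>user<|im_sep|>{user_prompt}<|im_end|>"
        f"<|im_start|>assistant<|im_sep|>"
    )


demo_prompt = format_phi4_prompt("Create a pendulum.", system_prompt="You are an engineer.")
print(demo_prompt.replace("<|im_end|>", "<|im_end|>\n"))
```

The string ends with the opening of the assistant turn, so the model continues from there with its answer.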

### Code Generation
Now the language model generates the actual simulation code. Depending on the hardware, this can take a minute or more.  

In [6]:
t1 = time.time()
print('... start code generation.')
strOutput = model.generate(FormattedStrTask, max_tokens=int(3000))
tokensOut = tokenizer.encode(strOutput, add_special_tokens=False)
dt = time.time() - t1
print('generating {} tokens took {}s  | {} tokens/s'.format(len(tokensOut), dt, round(len(tokensOut)/dt, 3)))
print(strOutput)
# originally took ~34 seconds on an NVIDIA RTX 4080 GPU

... start code generation.
generating 1008 tokens took 65.56406807899475s  | 15.374 tokens/s
To create a 3-link system of rigid bodies using Exudyn, we'll define each link as a rectangular prism (cuboid) aligned along the X-axis. Each link will have mass \(10 \text{ kg}\), length \(2 \text{ m}\), and width/height \(0.1 \text{ m}\). The first link is attached to the ground with a revolute joint at its left end, allowing rotation around the Y-axis.

Here's how you can set up this system in Exudyn:

```python
import exudyn as exu
from exudyn.utilities import *
import numpy as np

# Create a SystemContainer and add a MainSystem to work with
SC = exu.SystemContainer()
mbs = SC.AddSystem()

# Define the dimensions of each link
length = 2.0
width = height = 0.1
mass = 10.0

# Calculate inertia for each cuboid using InertiaCuboid function
inertiaLink = InertiaCuboid(density=mass/(length*width*height), sideLengths=[length, width, height])

# Define a graphics object for visualization of the lin

In [7]:
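# extract the first fenced Python block from the model output
strCode = strOutput.split("```python")[1].split("```")[0]
# CheckOutputLLM is assumed to be defined earlier in the notebook
CheckOutputLLM(strCode)
# run the generated code in the current namespace
exec(strCode)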

NameError: name 'graphics' is not defined

If the code fails to run, as with the `NameError` above, or the solver crashes, the model did not generate fully correct Python code. 
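
Extracting the code with chained `split` calls raises an `IndexError` when the answer contains no fenced Python block, and it silently returns truncated code when the closing fence is missing. The following is a minimal sketch of a more defensive extraction; `extract_python_code` is a hypothetical helper, and the fence string is built programmatically only to avoid nesting literal fences in this example:

```python
import re

FENCE = "`" * 3  # three backticks, built this way to avoid a nested literal fence


def extract_python_code(text):
    """Return the body of the first complete fenced Python block, or None."""
    pattern = re.compile(FENCE + r"python[ \t]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    return match.group(1) if match else None


reply = "Here is the code:\n" + FENCE + "python\nx = 1 + 1\n" + FENCE + "\nDone."
code = extract_python_code(reply)

scope = {}
if code is None:
    print("no complete code block found")
else:
    exec(code, scope)  # run in a separate namespace instead of the notebook's globals
    print(scope["x"])  # → 2
```

Running the generated code in its own dictionary keeps a faulty script from overwriting variables such as `model` or `tokenizer` in the notebook.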

Another option for running inference is the llama_cpp library. It is not listed in the requirements because it is optional and can be more complicated to set up. 


The code below uses _streaming_, which slows inference down significantly. 

In [None]:
try: ## optional using Llama_cpp and streaming: 
    from llama_cpp import Llama
    os.environ["LLAMA_SET_ROWS"] = "1"
    llm = Llama.from_pretrained(
        repo_id=repo_id,
        filename=filename,
        verbose=False, 
        n_ctx=16384,
        n_gpu_layers = -1, 
    )
    t1 = time.time()
    output = llm.create_chat_completion(
            messages=[
                { "role": "system", "content": "You are an engineer and expert in multibody Dynamics." },
                { "role": "user", "content": strTask}
            ], stream=True # show output continuously; this slows the model down 
        )
    full_response  = ""
    for chunk in output:
        delta = chunk['choices'][0]['delta']
        if 'role' in delta:
            print(delta['role'], end=': ')
        elif 'content' in delta:
            print(delta['content'], end='')
            full_response += delta['content']
    dt_llamacpp = time.time() - t1
    
except Exception as e: 
    print(e)
    print('llama_cpp not available. Try: pip install llama-cpp-python')
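
The accumulation loop in the cell above can be tested without loading a model. This is a minimal sketch, assuming each streamed chunk has the `choices[0]['delta']` layout used above; `collect_stream` and the fake chunks are hypothetical and only imitate that layout:

```python
def collect_stream(chunks, echo=False):
    """Join the 'content' pieces of streamed chat-completion chunks."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        content = delta.get("content")
        if content:  # role-only and empty deltas carry no text
            parts.append(content)
            if echo:
                print(content, end="")
    return "".join(parts)


# Fake chunks: a role announcement, two text pieces, and an empty final delta.
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "import "}}]},
    {"choices": [{"delta": {"content": "exudyn"}}]},
    {"choices": [{"delta": {}}]},
]

demo_response = collect_stream(fake_chunks)
print(demo_response)  # → import exudyn
```

Collecting the pieces in a list and joining once avoids repeated string concatenation for long answers.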


In [None]:
nTokens = len(tokenizer.encode(full_response, add_special_tokens=False))
print('streaming {} tokens took {} seconds | {} tokens/second. '.format(nTokens, round(dt_llamacpp, 1), round(nTokens/dt_llamacpp, 4)))
strCode_llamacpp = full_response.split("```python")[1].split("```")[0]
CheckOutputLLM(strCode_llamacpp)
exec(strCode_llamacpp)