
Prevent divide by zero in cuda kernel #471

Merged 1 commit on Jun 24, 2024

Conversation

joshpopelka20 (Contributor)

Fix for bug in issue #395.

I didn't add an error message, so feel free to add one. I think it's okay to let it bypass that code when the value is zero (preventing a divide by zero). I ran Llama and there were no issues on my end.


Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    9           21           21            0            0
 Python                 32         1256         1075           37          144
 TOML                   16          444          403            1           40
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               18         1318            0          980          338
 |- BASH                 5          100           97            0            3
 |- Python               6          122          110            0           12
 |- Rust                 3          151          135            6           10
 (Total)                           1691          342          986          363
-------------------------------------------------------------------------------
 Rust                  119        36404        32889          647         2868
 |- Markdown            59          663           13          613           37
 (Total)                          37067        32902         1260         2905
===============================================================================
 Total                 198        39919        34782         1665         3472
===============================================================================
  

@EricLBuehler (Owner)

Thanks for adding this. I think that dims[i] should never equal zero, so this may be more of a compiler thing?

@EricLBuehler EricLBuehler merged commit 333ce88 into EricLBuehler:master Jun 24, 2024
10 checks passed
@joshpopelka20 (Contributor, Author)

Yeah, SageMaker may be using an older version of CUDA that causes the issue. Can you bump the PyPI package and I'll verify that it works as well?

@EricLBuehler (Owner)

Sure, there are a few things I want to merge first, so I'll probably do it in a few hours.

@joshpopelka20 (Contributor, Author)

No rush. Thanks!

@EricLBuehler (Owner)

Actually, can you please try installing from source before I put out a PyPI release?

@joshpopelka20 (Contributor, Author)

I cloned the repo and ran `CUDA_NVCC_FLAGS="-fPIE" maturin build -r --features "cuda flash-attn cudnn"` in my notebook instance. It completed successfully with no errors.
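For reference, that same build command can be driven from Python inside a notebook. This is a minimal sketch, assuming `maturin` and the CUDA toolchain are on `PATH` (the actual build line is left commented out since it requires a CUDA machine):

```python
import os
import subprocess

# Copy the environment and pass the position-independent-executable flag to nvcc
env = os.environ.copy()
env["CUDA_NVCC_FLAGS"] = "-fPIE"

cmd = ["maturin", "build", "-r", "--features", "cuda flash-attn cudnn"]
print(" ".join(cmd))
# subprocess.run(cmd, env=env, check=True)  # uncomment on a machine with CUDA + maturin
```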

@EricLBuehler (Owner)

Ok, thanks! I'll publish a new release soon.

@EricLBuehler (Owner) commented Jun 24, 2024

@joshpopelka20, do you have a Python script to install Rust, or ideally to do the full mistralrs installation? I'm looking at adding a Google Colab example, and the only difficulty is installing Rust and the other dependencies. If not, I'll write one, but since you seem to be using the Python API, I was wondering whether you already had something like this.

@joshpopelka20 (Contributor, Author) commented Jun 24, 2024

These are the Python commands that I run:

1. Install Rust

```python
import requests
import subprocess
import os

# Download the rustup installer script over HTTPS
response = requests.get('https://sh.rustup.rs')

# Check that the request was successful
if response.status_code == 200:
    # Run the installer non-interactively
    subprocess.run(['sh', '-s', '--', '-y'], input=response.text, text=True, check=True)

    # Prepend the Rust bin directory to PATH so rustc and cargo can be found
    rust_bin_path = os.path.expanduser("~/.cargo/bin")
    os.environ['PATH'] = rust_bin_path + os.pathsep + os.environ.get('PATH', '')

    # Verify the toolchain is reachable
    result = subprocess.run(['rustc', '--version'], capture_output=True, text=True)
    if result.returncode != 0:
        print("Error occurred:")
        print("Standard Output:", result.stdout)
        print("Standard Error:", result.stderr)
    else:
        print("Success:", result.stdout.strip())
else:
    print("Failed to retrieve the script.")
```
2. Install Mistral.rs

```python
import os
import subprocess

# Build and install the CUDA wheel with position-independent code enabled for nvcc
env = os.environ.copy()
env["CUDA_NVCC_FLAGS"] = "-fPIE"
result = subprocess.run(['pip', 'install', 'mistralrs-cuda'], env=env)
print("Success" if result.returncode == 0 else "pip install failed")
```

A Colab notebook would be nice. It would make it really easy for beginners to get started.

@EricLBuehler (Owner)

Thank you for sending that! I was able to put together a small Colab notebook, and I'll probably release one or a few to showcase the various features of mistral.rs.

@EricLBuehler (Owner)

@joshpopelka20 (Contributor, Author)

It worked for me.

The only suggestion would be to add some comments about how to get a HF token and how to request access to the Mistral-7B-Instruct-v0.1 model (I was getting a 403 error and had to go to the Hugging Face model page and click the 'request access' button).

@EricLBuehler (Owner)

Thanks for the feedback! I added some markdown comments describing it, so hopefully it's easier to understand now.
