
Prevent divide by zero in cuda kernel #471

Merged 1 commit on Jun 24, 2024

Conversation

joshpopelka20 (Contributor)

Fix for bug in issue #395.

I didn't add an error message, so feel free to add one. I think it's okay to let it bypass that code when the value is zero (preventing a divide by zero). I ran Llama and there were no issues on my end.


Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    9           21           21            0            0
 Python                 32         1256         1075           37          144
 TOML                   16          444          403            1           40
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               18         1318            0          980          338
 |- BASH                 5          100           97            0            3
 |- Python               6          122          110            0           12
 |- Rust                 3          151          135            6           10
 (Total)                           1691          342          986          363
-------------------------------------------------------------------------------
 Rust                  119        36404        32889          647         2868
 |- Markdown            59          663           13          613           37
 (Total)                          37067        32902         1260         2905
===============================================================================
 Total                 198        39919        34782         1665         3472
===============================================================================
  

@EricLBuehler (Owner)

Thanks for adding this. I think that dims[i] should never equal zero, so this may be more of a compiler thing?

@EricLBuehler EricLBuehler merged commit 333ce88 into EricLBuehler:master Jun 24, 2024
10 checks passed
@joshpopelka20 (Contributor, Author)

Yeah, SageMaker may be using an older version of CUDA that causes the issue. Can you bump the PyPI package and I'll verify that it works as well?

@EricLBuehler (Owner)

Sure, there are a few things I want to merge first, so I'll probably do it in a few hours.

@joshpopelka20 (Contributor, Author)

No rush. Thanks!

@EricLBuehler (Owner)

Actually, can you please try installing from source before I put out a PyPI release?

@joshpopelka20 (Contributor, Author)

I cloned the repo and ran `CUDA_NVCC_FLAGS="-fPIE" maturin build -r --features "cuda flash-attn cudnn"` in my notebook instance. It completed successfully with no errors.
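For reference, that same build command can be driven from Python inside a notebook. This is a minimal sketch, assuming `maturin` and the CUDA toolchain are on `PATH` (the actual build line is left commented out since it requires a CUDA machine):

```python
import os
import subprocess

# Copy the environment and pass the position-independent-executable flag to nvcc
env = os.environ.copy()
env["CUDA_NVCC_FLAGS"] = "-fPIE"

cmd = ["maturin", "build", "-r", "--features", "cuda flash-attn cudnn"]
print(" ".join(cmd))
# subprocess.run(cmd, env=env, check=True)  # uncomment on a machine with CUDA + maturin
```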

@EricLBuehler (Owner)

Ok, thanks! I'll publish a new release soon.

@EricLBuehler (Owner) commented Jun 24, 2024

@joshpopelka20, do you have a Python script to install Rust, or ideally to do the full mistralrs installation? I'm looking at adding a Google Colab example, and the only difficulty is installing Rust and the other dependencies. If not, I'll write one, but since you seem to be using the Python API, I was wondering whether you already had something like this.

@joshpopelka20 (Contributor, Author) commented Jun 24, 2024

These are the Python commands that I run:

1. Install Rust

```python
import requests
import subprocess
import os

# Download the rustup installer script over HTTPS
response = requests.get('https://sh.rustup.rs')

# Check that the request was successful
if response.status_code == 200:
    # Run the installer non-interactively
    subprocess.run(['sh', '-s', '--', '-y'], input=response.text, text=True, check=True)

    # Prepend the Rust bin directory to PATH so rustc and cargo can be found
    rust_bin_path = os.path.expanduser("~/.cargo/bin")
    os.environ['PATH'] = rust_bin_path + os.pathsep + os.environ.get('PATH', '')

    # Verify the toolchain is reachable
    result = subprocess.run(['rustc', '--version'], capture_output=True, text=True)
    if result.returncode != 0:
        print("Error occurred:")
        print("Standard Output:", result.stdout)
        print("Standard Error:", result.stderr)
    else:
        print("Success:", result.stdout.strip())
else:
    print("Failed to retrieve the script.")
```
2. Install Mistral.rs

```python
import os
import subprocess

# Build and install the CUDA wheel with position-independent code enabled for nvcc
env = os.environ.copy()
env["CUDA_NVCC_FLAGS"] = "-fPIE"
result = subprocess.run(['pip', 'install', 'mistralrs-cuda'], env=env)
print("Success" if result.returncode == 0 else "pip install failed")
```

A Colab notebook would be nice. It would make it really easy for beginners to get started.

@EricLBuehler (Owner)

Thank you for sending that! I was able to put together a small Colab notebook, and I'll probably release one or a few to showcase the various features of mistral.rs.

@EricLBuehler (Owner)

@joshpopelka20 (Contributor, Author)

It worked for me.

The only suggestion would be to add some comments about how to get a HF token and how to request access to the Mistral-7B-Instruct-v0.1 model (I was getting a 403 error and had to go to the Hugging Face model page and click the 'request access' button).

@EricLBuehler (Owner)

Thanks for the feedback! I added some markdown comments describing it, so hopefully it's easier to understand now.
