bc3ec39d breaks the compilation (as noted in #1355) #1359
Comments
I cannot install Tokenizers either. It is installed as a dependency of the Allen-NLP package, and the new version of the Rust compiler breaks it. Installing Rust with the shell script from the Rust site installs 1.73.0 (I presume), which breaks the Tokenizers compilation, but installing it via Homebrew installs 1.72.1, which works. |
Which version are you using? This was already fixed on main: https://github.com/huggingface/tokenizers/blob/main/tokenizers/src/models/bpe/trainer.rs#L541-L546 |
To work around this error, I installed transformers with conda, using the command 'conda install -c huggingface transformers'. Then it works. |
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
I have the same problem with Python 3.11. Do you need more information about this issue? |
@DavidAdamczyk Use a more recent tokenizers version, or an older Rust compiler version. |
I use the latest version of tokenizers and the most recent stable version of the Rust compiler. Additionally, I follow the installation instructions available here. Could someone update the installation instructions and include information about the supported versions of all dependencies? |
Hi. Edit: the strategy to solve this error is to use an older Rust version (this is what I did).
After this, it should work properly. |
Running into this tonight too. Requirement already satisfied: requests in c:\users\dhorner\anaconda3\envs\hotz\lib\site-packages (from transformers==4.15.0->-r requirements.txt (line 2)) (2.31.0). The solution for me was to set RUSTFLAGS=-A invalid_reference_casting |
I also ran into this issue last week, installing transformers==4.22.1, which is pinned by a different project. I also worked around it by running: export RUSTFLAGS="-A invalid_reference_casting" ...before installing, but it'd be great if the problem could be tackled at the source! |
I would love to be the one to help resolve this with something better than an environment flag. tokenizers-lib/src/models/bpe/trainer.rs:526: I do not see tokenizers-lib in the tree. The error guidance is not clear. GPT says: To resolve this issue, you should use appropriate safe patterns for mutable access, such as Cell, RefCell, or UnsafeCell for interior mutability, depending on your specific use case. In your case, since you're dealing with mutable access to data through raw pointers, you should consider using UnsafeCell. Here's how you can adjust your code:
use std::cell::UnsafeCell;
// Assuming Word is some struct or type you're working with
struct Word {
// fields of Word
}
// Assuming words is some collection of Word
let words: Vec<Word> = /* initialization of words */;
// Assuming i is some index into the words vector
let i = /* index */;
// Accessing the word at index i in a mutable way
let w = &words[i] as *const _ as *mut UnsafeCell<Word>;
let word: &UnsafeCell<Word> = unsafe { &*w };
let word_mut: &mut Word = unsafe { &mut *word.get() };
However, using UnsafeCell requires careful handling as it bypasses Rust's safety checks. Make sure you understand the implications of using UnsafeCell and ensure that your code is correct and safe. Alternatively, consider restructuring your code to avoid mutable raw pointer access if possible, as raw pointer manipulation can be error-prone and harder to reason about compared to safe Rust constructs. See also the Rustonomicon.
Could someone orient me to where the code is? I don't know where it lives. |
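For comparison, here is a minimal sketch of two safe ways to mutate one element of a collection without casting a shared reference to a mutable one. As far as I understand, the quoted snippet still casts `&words[i]` (a plain `&Word`) to a mutable pointer, so it does not remove the pattern the lint complains about; the data would have to be stored in a cell type from the start. Everything below is hypothetical and only for illustration: the `Word` struct, its `merge` method, and the helpers `merge_at` and `merge_at_shared` are stand-ins, not the code in the tokenizers trainer.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for the trainer's word type; not the real `Word`.
struct Word {
    symbols: Vec<u32>,
}

impl Word {
    // Hypothetical merge: replace each adjacent (a, b) pair with `new_id`.
    fn merge(&mut self, a: u32, b: u32, new_id: u32) {
        let mut out = Vec::with_capacity(self.symbols.len());
        let mut i = 0;
        while i < self.symbols.len() {
            let is_pair = i + 1 < self.symbols.len()
                && self.symbols[i] == a
                && self.symbols[i + 1] == b;
            if is_pair {
                out.push(new_id);
                i += 2;
            } else {
                out.push(self.symbols[i]);
                i += 1;
            }
        }
        self.symbols = out;
    }
}

// Option 1: take the collection as `&mut` and borrow one element mutably.
fn merge_at(words: &mut [Word], i: usize, a: u32, b: u32, new_id: u32) {
    words[i].merge(a, b, new_id);
}

// Option 2: if only a shared slice is available, store each element in a
// `RefCell` so the mutable borrow is checked at run time instead.
fn merge_at_shared(words: &[RefCell<Word>], i: usize, a: u32, b: u32, new_id: u32) {
    words[i].borrow_mut().merge(a, b, new_id);
}

fn main() {
    let mut words = vec![Word { symbols: vec![1, 2, 3, 1, 2] }];
    merge_at(&mut words, 0, 1, 2, 9);
    println!("{:?}", words[0].symbols); // prints [9, 3, 9]

    let shared = vec![RefCell::new(Word { symbols: vec![1, 2, 2] })];
    merge_at_shared(&shared, 0, 1, 2, 9);
    println!("{:?}", shared[0].borrow().symbols); // prints [9, 2]
}
```

Note that `RefCell` is not `Sync`, so Option 2 only fits single-threaded code; a parallel loop would need a different design (for example, handing each worker its own `&mut` element).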
I'll close this, as I believe the latest releases no longer have this issue.
As stated, this commit breaks building the tokenizers on modern toolchains, even stable:
% rustc -V
rustc 1.73.0 (cc66ad468 2023-10-03)