Facing issue while getting model from Rag,pretrained #36548

@MAHESH18TECH

Description

Code

`from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

# Initialize the tokenizer and model
model_name = "facebook/rag-token-nq"
tokenizer = RagTokenizer.from_pretrained(model_name)
model = RagTokenForGeneration.from_pretrained(model_name)

# Initialize the retriever (this is the call that raises the error below)
retriever = RagRetriever.from_pretrained(model_name)

# Tokenization function
def tokenize_function(examples):
    return tokenizer(
        examples['text'],
        truncation=True,
        padding='max_length',
        max_length=512
    )

# Tokenize the dataset
# (`dataset` is assumed to be loaded earlier and to have a 'text' column)
tokenized_dataset = dataset.map(
    tokenize_function,
    batched=True,
    remove_columns=dataset.column_names
)`

Error

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/'. Use `repo_type` argument if needed.

The above exception was the direct cause of the following exception:

OSError Traceback (most recent call last)

OSError: Incorrect path_or_model_id: 'https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/'. Please provide either the path to a local folder or the repo_id of a model on the Hub.

During handling of the above exception, another exception occurred:

OSError Traceback (most recent call last)

/usr/local/lib/python3.11/dist-packages/transformers/models/rag/retrieval_rag.py in _resolve_path(self, index_path, filename)
122 f"- or '{index_path}' is the correct path to a directory containing a file named {filename}.\n\n"
123 )
--> 124 raise EnvironmentError(msg)
125 if is_local:
126 logger.info(f"loading file {resolved_archive_file}")

OSError: Can't load 'psgs_w100.tsv.pkl'. Make sure that:
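
Possible workaround (a sketch, not verified against this environment): the traceback shows the retriever trying to fetch the legacy index file `psgs_w100.tsv.pkl` from a storage URL that is then rejected as a Hub repo id. The RAG documentation for `transformers` instead loads the retriever with `index_name="exact"` and `use_dummy_dataset=True`, then passes it to the model. The helper names `retriever_kwargs` and `load_rag` below are hypothetical. Whether this resolves the error may depend on the installed versions of `transformers`, `datasets`, and `faiss`.

```python
MODEL_NAME = "facebook/rag-token-nq"


def retriever_kwargs(use_dummy: bool = True) -> dict:
    """Keyword arguments that select the `datasets`-based index instead of the legacy one.

    Hypothetical helper for illustration; `index_name` and `use_dummy_dataset`
    are the options used in the transformers RAG documentation example.
    """
    return {"index_name": "exact", "use_dummy_dataset": use_dummy}


def load_rag(model_name: str = MODEL_NAME, use_dummy: bool = True):
    """Load tokenizer, retriever, and model, attaching the retriever to the model.

    Imports are deferred so that defining this function does not require
    transformers (or a network connection) until it is actually called.
    """
    from transformers import RagRetriever, RagTokenForGeneration, RagTokenizer

    tokenizer = RagTokenizer.from_pretrained(model_name)
    # Assumption: overriding the index settings avoids the legacy index path
    # that appears in the traceback above.
    retriever = RagRetriever.from_pretrained(model_name, **retriever_kwargs(use_dummy))
    model = RagTokenForGeneration.from_pretrained(model_name, retriever=retriever)
    return tokenizer, retriever, model
```

Calling `tokenizer, retriever, model = load_rag()` would download the model and a small dummy index. Setting `use_dummy=False` requests the full index, which is a very large download.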
