File Not Found Error #26

Closed
jatinarora15 opened this issue Feb 26, 2024 · 2 comments

@jatinarora15

While running run_pipeline.py I'm encountering the below error:

Traceback (most recent call last):
  File "run_pipeline.py", line 66, in <module>
    run_RG1_and_oracle_method(CONSTANTS.api_benchmark, repos, window_sizes, slice_sizes)
  File "run_pipeline.py", line 28, in run_RG1_and_oracle_method
    CodeSearchWrapper('one-gram', benchmark, repos, window_sizes, slice_sizes).search_baseline_and_ground()
  File "/local/CodeT/RepoCoder/search_code.py", line 119, in search_baseline_and_ground
    self._run_parallel(query_line_path_temp)
  File "/local/CodeT/RepoCoder/search_code.py", line 105, in _run_parallel
    repo_embedding_lines = Tools.load_pickle(repo_embedding_path)
  File "/local/CodeT/RepoCoder/utils.py", line 118, in load_pickle
    with open(fname, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'cache/vector/repos/huggingface_diffusers_ws20_slice2.one-gram.pkl'

Additional Info:

The file huggingface_diffusers_ws20_slice2.one-gram.pkl is created in the cache/vector/random_api directory; however, there is nothing in cache/vector/repo.
There is also a huggingface_diffusers_ws20_slice2.pkl file in cache/window/repo/.

Can someone please help with this issue?

@zfj1998
Collaborator

zfj1998 commented Apr 8, 2024

cache/vector/repo is supposed to contain the embedding vectors of the repo windows. In run_pipeline.py, one has to manually call the vectorize_repo_windows() method before starting the search process.

@zfj1998 zfj1998 closed this as completed Apr 8, 2024
@zfj1998
Collaborator

zfj1998 commented May 3, 2024

The cache files under "vector/repos/" are vectorized code fragments built from the code files within a repo. During code search, these vectorized fragments are used for similarity comparison. So before running the RG1 or RepoCoder method, the code windows produced from each repo must be vectorized by calling the vectorize_repo_windows() function. I have updated the code in commit 6a6ef63. Sorry for the inconvenience.
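For anyone hitting the same traceback, a small pre-flight check can confirm whether the repo vectors exist before the search step runs. The sketch below is not part of RepoCoder: the helper name missing_repo_vector_caches is hypothetical, and the file-name pattern is only inferred from the path in the error message above (cache/vector/repos/huggingface_diffusers_ws20_slice2.one-gram.pkl), so it may need adjusting if the repository builds its paths differently.

```python
from pathlib import Path
from typing import Iterable, List


def missing_repo_vector_caches(
    repos: Iterable[str],
    window_sizes: Iterable[int],
    slice_sizes: Iterable[int],
    vectorizer: str = "one-gram",
    cache_dir: str = "cache/vector/repos",
) -> List[Path]:
    """Return the expected repo-vector cache files that do not exist yet.

    Hypothetical helper: the naming scheme
    "<repo>_ws<window>_slice<slice>.<vectorizer>.pkl" is an assumption
    taken from the path shown in the FileNotFoundError.
    """
    # Materialize the iterables so they can be reused in the nested loops.
    window_sizes = list(window_sizes)
    slice_sizes = list(slice_sizes)

    missing: List[Path] = []
    for repo in repos:
        for ws in window_sizes:
            for ss in slice_sizes:
                name = f"{repo}_ws{ws}_slice{ss}.{vectorizer}.pkl"
                path = Path(cache_dir) / name
                if not path.is_file():
                    missing.append(path)
    return missing


if __name__ == "__main__":
    # Values taken from the failing run reported in this issue.
    missing = missing_repo_vector_caches(["huggingface_diffusers"], [20], [2])
    if missing:
        print("Run vectorize_repo_windows() first; missing cache files:")
        for path in missing:
            print(f"  {path}")
```

If the check reports missing files, the repo windows have not been vectorized yet, which matches the cause described in the comments above.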
