I am working with a script for reading and vectorizing data from GitHub.
This is the code in the script (it worked with older, <0.10 versions of llama-index):
while True:
    owner, repo = parse_github_url(github_url)
    if validate_owner_repo(owner, repo):
        loader = GithubRepositoryReader(
            github_client,
            owner=owner,
            repo=repo,
            filter_file_extensions=(
                [".py", ".js", ".ts", ".md"],
                GithubRepositoryReader.FilterType.INCLUDE,
            ),
            verbose=False,
            concurrent_requests=5,
        )
        print(f"Loading {repo} repository by {owner}")
        docs = loader.load_data(branch="main")
        print("Documents uploaded:")
        for doc in docs:
            print(doc.metadata)
        break  # Exit the loop once the valid URL is processed
    else:
        print("Invalid GitHub URL. Please try again.")
        github_url = input("Please enter the GitHub repository URL: ")
After fixing my imports to be compatible with >=0.10.5, running the above code generates the following error:
Traceback (most recent call last):
  File "C:\Users\aaols\PycharmProjects\experiments\llamaindex_activeloop_vectorize_data_from_github.py", line 123, in <module>
    main()
  File "C:\Users\aaols\PycharmProjects\experiments\llamaindex_activeloop_vectorize_data_from_github.py", line 73, in main
    GithubRepositoryReader.FilterType.INCLUDE,
AttributeError: type object 'GithubRepositoryReader' has no attribute 'FilterType'
Looking at the code for the latest version of GithubRepositoryReader, it looks like the filter_file_extensions argument is no longer supported, and neither is GithubRepositoryReader.FilterType.
Is this intentional? Does the reader no longer support filtering by an allowlist, and instead only support filtering via a denylist?
If so, specifying everything I don't want is a much bigger task than specifying the few things I do want.
If not, what needs to be updated to support an explicit allowlist?
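In the meantime, one possible workaround is to apply the allowlist after loading rather than inside the reader. The sketch below is only an illustration under assumptions: `keep_allowed` and `FakeDoc` are hypothetical names, not part of llama-index, and it assumes each loaded document exposes a `metadata` dict with the file path under a `"file_path"` key, which may differ in your version (the `print(doc.metadata)` loop above shows the real keys).

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Extensions taken from the original filter_file_extensions allowlist.
ALLOWED_EXTENSIONS = frozenset({".py", ".js", ".ts", ".md"})


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a loaded document; only `metadata` is used."""
    metadata: dict = field(default_factory=dict)


def keep_allowed(docs, allowed=ALLOWED_EXTENSIONS, path_key="file_path"):
    """Return only the documents whose path ends in an allowed extension.

    `path_key` is an assumption about the metadata layout; documents
    without that key are dropped.
    """
    kept = []
    for doc in docs:
        path = doc.metadata.get(path_key, "")
        # Compare case-insensitively so "README.MD" matches ".md".
        if PurePosixPath(path).suffix.lower() in allowed:
            kept.append(doc)
    return kept


sample = [
    FakeDoc({"file_path": "src/app.py"}),
    FakeDoc({"file_path": "img/logo.png"}),
    FakeDoc({"file_path": "README.MD"}),
    FakeDoc({}),
]
filtered = keep_allowed(sample)
print([d.metadata["file_path"] for d in filtered])
```

With real data this would be called as `docs = keep_allowed(loader.load_data(branch="main"))`. The drawback is that every file is still downloaded before being discarded, so this is not a substitute for filtering inside the reader.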
Version
0.10.6
Steps to Reproduce
Try referencing GithubRepositoryReader.FilterType.INCLUDE with the latest version of llama-index-readers-github.
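To check what the installed reader class actually exposes without triggering the AttributeError, a small introspection helper can be used. This is a rough sketch using only the standard library: `describe_reader` is a hypothetical helper, and `_ExampleReader` with its `ignore_file_extensions` parameter is a made-up stand-in so the snippet runs on its own; it is not the real class signature.

```python
import inspect


def describe_reader(cls):
    """Report whether `cls` has a FilterType attribute and which
    constructor parameters look like filter or ignore options."""
    params = inspect.signature(cls.__init__).parameters
    return {
        "has_filter_type": hasattr(cls, "FilterType"),
        "filter_params": sorted(
            name for name in params if "filter" in name or "ignore" in name
        ),
    }


class _ExampleReader:
    """Hypothetical stand-in; replace with the imported GithubRepositoryReader."""

    def __init__(self, owner, repo, ignore_file_extensions=None):
        self.owner = owner
        self.repo = repo
        self.ignore_file_extensions = ignore_file_extensions


report = describe_reader(_ExampleReader)
print(report)
```

Running `describe_reader(GithubRepositoryReader)` in the affected environment should show which filtering parameters that particular install accepts.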
Relevant Logs/Tracebacks
No response
Thank you for bringing this to our attention. It seems like there has been a change in the GithubRepositoryReader class in the latest version of LlamaIndex. I apologize for the inconvenience this has caused.
To better assist you, could you please provide the following additional information:
Could you confirm if you have updated all your dependencies to their latest versions along with LlamaIndex?
Could you share the exact version of the llama-index-readers-github package you are using?
Are there any other parts of your code that interact with the GithubRepositoryReader class or its instances?
This information will help us understand the issue more clearly and provide a more accurate solution. Thank you for your cooperation.
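For the version questions above, the installed package versions can be read with the standard library's `importlib.metadata`. This is a minimal sketch: `installed_version` is a hypothetical helper name, and the list of distributions checked is just a guess at what is relevant here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names assumed to be relevant to this issue.
for name in ("llama-index", "llama-index-core", "llama-index-readers-github"):
    print(name, installed_version(name) or "not installed")
```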