Cannot download docx files from google drive link #253

Closed
saireddy12 opened this issue Mar 22, 2023 · 1 comment
Labels
bug for issue

Comments

saireddy12 commented Mar 22, 2023

Provide environment information

Python 3.10.10
gdown 4.6.4

What OS are you using?

macOS 12.6.3

Describe the Bug

I called gdown.download from Python:

import gdown

url = "https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true"
output = '/content/'
gdown.download(url=url, output=output, quiet=False, fuzzy=True)

This is the output I get. The file is not downloaded; only the HTML page is saved, and the file name is also wrong:

From: https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true
To: /content/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true
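
The name in the "To:" line is exactly what you get by joining the output directory with the last path segment of the raw URL, query string included. The snippet below only reproduces that string; it is an observation about the URL, not a statement about how gdown picks the name internally.

```python
import posixpath

url = "https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true"
output = "/content/"

# Last path segment of the raw URL string; the query string is not stripped.
last_segment = posixpath.basename(url)

# Joining it with the output directory gives the path shown in the "To:" line.
target = posixpath.join(output, last_segment)
print(target)
```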

I see that the file downloads properly when the file id is used:

url = "https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true"
id = "1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD"
output = '/content/'
gdown.download(id=id, output=output, quiet=False,fuzzy=True)

This works fine, so the id needs to be extracted from the URL and passed through the id argument when the link is a docs.google.com file link.
I see you added code to handle docs.google.com links, but I am not sure why it is not working.

Anyway, I created a function to extract the id from a file link (file links only):

from urllib.parse import urlparse

def extract_id_from_drive_link(file_link):
    """Extract the id from a Google Drive file link; return -1 on failure."""
    # sample file: https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true
    id = -1
    try:
        parsed = urlparse(file_link)
        # only docs.google.com and drive.google.com links are handled
        if parsed.hostname in ["docs.google.com", "drive.google.com"]:
            link_path = parsed.path  # /document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit
            # assumes the id is the second-to-last path segment
            id = link_path.split('/')[-2]
        else:
            # unsupported host: report it and fall through to return -1
            print("please check the file link, only Google Drive file links are supported")
    except Exception as er:
        print(f"error occurred while trying to extract id from drive link, error is: {er}")

    return id

You can pass the file link to the function above and call download with the returned id; it works for both docs and drive links.

You can do something like this:

url = "https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true"
output = '/content/'
id = extract_id_from_drive_link(file_link=url)
gdown.download(id=id, output=output, quiet=False, fuzzy=True)
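
A slightly more defensive variant of the same idea, as a rough sketch only. The helper name `extract_drive_file_id` is hypothetical and not part of gdown. It assumes the id appears either after a `/d/` path segment (as in the link above) or in an `id` query parameter, and it returns None instead of -1 when nothing is found.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_drive_file_id(file_link):
    """Return the file id from a Google Drive/Docs link, or None if not found.

    Hypothetical helper for illustration; not part of gdown.
    """
    parsed = urlparse(file_link)
    if parsed.hostname not in ("docs.google.com", "drive.google.com"):
        return None

    # Case 1: the id follows a "/d/" segment, e.g. /document/d/<id>/edit
    match = re.search(r"/d/([A-Za-z0-9_-]+)", parsed.path)
    if match:
        return match.group(1)

    # Case 2: the id is passed as a query parameter, e.g. /uc?id=<id>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


url = "https://docs.google.com/document/d/1HOzb__2DdfS1fMDn9_EpoItgeamwRGMD/edit?usp=sharing&ouid=114613300604928585962&rtpof=true&sd=true"
file_id = extract_drive_file_id(url)
print(file_id)
```

Unlike taking the second-to-last path segment, this does not depend on a trailing `/edit` or `/view` being present. The result can be passed to gdown.download(id=..., output=...) as in the call above.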

FYI, I am using gdown 4.6.4.

Expected Behavior

No response

To Reproduce

No response

saireddy12 added the bug for issue label Mar 22, 2023