[BUG] RegexMatchError ? #1199

Open
Rizmi opened this issue Jan 19, 2022 · 51 comments
@Rizmi

Rizmi commented Jan 19, 2022

Until now it has worked perfectly, but now it's throwing an error.
Is this only happening to me? Can anyone help? Thanks.

Ignoring exception in on_message
Traceback (most recent call last):
  File "/.local/lib/python3.9/site-packages/pytube/__main__.py", line 177, in fmt_streams
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "/.local/lib/python3.9/site-packages/pytube/extract.py", line 409, in apply_signature
    cipher = Cipher(js=js)
  File "/.local/lib/python3.9/site-packages/pytube/cipher.py", line 33, in __init__
    raise RegexMatchError(
pytube.exceptions.RegexMatchError: __init__: could not find match for ^\\w+\\W
@Rizmi Rizmi added the bug label Jan 19, 2022
@github-actions

Thank you for contributing to PyTube. Please remember to reference Contributing.md

@AlexanderMelnikov

AlexanderMelnikov commented Jan 19, 2022

Same.
Simple code:

from pytube import YouTube  
YouTube('https://www.youtube.com/watch?v=dQw4w9WgXcQ').streams.first().download()

Raises the error:

pytube.exceptions.RegexMatchError: __init__: could not find match for ^\w+\W

@mishailovic

lmao, got the same error and wanted to open an issue. Literally yesterday everything was fine

@TheHamkerCat

Probably some new changes on YouTube's side.

@compsage

ditto

@Lolcker2

Got the same error. Is it because of the new version?
(The error: pytube.exceptions.RegexMatchError: __init__: could not find match for ^\w+\W)

@tylerweston

tylerweston commented Jan 19, 2022

As far as I can tell, this is being thrown from cipher.py, which attempts to get around YouTube's measures against automated video downloading. If YT wised up and is using a tougher challenge method now, this could be a tough nut to crack. Lines 30, 31 in cipher.py:

        var_regex = re.compile(r"^\w+\W")
        var_match = var_regex.search(self.transform_plan[0])
        if not var_match:
            raise RegexMatchError(
                caller="__init__", pattern=var_regex.pattern
            )

The JS that the regex searches is a giant obfuscated mess that I am in no way qualified to deal with, so this is where my help with this issue ends.

@JavDomGom

JavDomGom commented Jan 19, 2022

Same. The exception is raised here: https://github.com/pytube/pytube/blob/master/pytube/cipher.py#L31

snapshot_issue_1199

The JS code changed. Reviewing the regular expression is a good starting point.

snapshot_issue_1199_2

@juanchosaravia

juanchosaravia commented Jan 19, 2022

I just updated the regex like this and it's working:
^\$*\w+\W

image

I'm able to download the videos again.
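
For anyone wondering why the extra `\$*` matters, here is a minimal sketch. It assumes the first transform step now starts with a `$`-prefixed JavaScript identifier, which the fix suggests but the thread only shows in screenshots. The sample strings and the helper `first_token` are made up for illustration and are not pytube code.

```python
import re

OLD_PATTERN = re.compile(r"^\w+\W")     # pattern from the traceback
NEW_PATTERN = re.compile(r"^\$*\w+\W")  # pattern proposed in this thread


def first_token(pattern, step):
    """Return the text matched at the start of `step`, or None if nothing matches."""
    match = pattern.search(step)
    return match.group(0) if match else None


# Hypothetical transform steps; the real ones come from YouTube's player JS.
samples = ["ab.cd(a,3)", "$ab.cd(a,3)"]

for step in samples:
    # "$" is not a word character, so the old pattern fails on the second sample.
    print(step, "| old:", first_token(OLD_PATTERN, step), "| new:", first_token(NEW_PATTERN, step))
```

With the old pattern the `$`-prefixed sample yields None, which is the condition that makes cipher.py raise RegexMatchError.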

@tylerweston

Beaaauty! I tried but forgot the * between the $ and w! Time to go review my regular expressions! Cheers

JavDomGom added a commit to JavDomGom/pytube that referenced this issue Jan 19, 2022
@JavDomGom

Thank you Juancho, I've added this fix to PR #1193.

@arturtamborski

@JavDomGom please add another change to your PR:

# pytube/parser.py, line 152
-    func_regex = re.compile(r"function\([^)]+\)")
+    func_regex = re.compile(r"function\([^)]*\)")

this fixes another exception occurring during audio fetching:

    yt.streams.get_by_itag(140).download(directory, file_name)
  File "/home/artur/.venv/lib/python3.8/site-packages/pytube/__main__.py", line 292, in streams
    return StreamQuery(self.fmt_streams)
  File "/home/artur/.venv/lib/python3.8/site-packages/pytube/__main__.py", line 177, in fmt_streams
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "/home/artur/.venv/lib/python3.8/site-packages/pytube/extract.py", line 409, in apply_signature
    cipher = Cipher(js=js)
  File "/home/artur/.venv/lib/python3.8/site-packages/pytube/cipher.py", line 44, in __init__
    self.throttling_array = get_throttling_function_array(js)
  File "/home/artur/.venv/lib/python3.8/site-packages/pytube/cipher.py", line 323, in get_throttling_function_array
    str_array = throttling_array_split(array_raw)
  File "/home/artur/.venv/lib/python3.8/site-packages/pytube/parser.py", line 158, in throttling_array_split
    match_start, match_end = match.span()
AttributeError: 'NoneType' object has no attribute 'span'
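
A small sketch of the difference between the two patterns in the diff above. The JavaScript snippets are invented for illustration, and the helper `find_function_span` is hypothetical. The point is only that `[^)]+` requires at least one character between the parentheses, so a function with an empty parameter list produces no match, and calling `.span()` on the resulting None gives the AttributeError shown in the traceback.

```python
import re

OLD_FUNC_REGEX = re.compile(r"function\([^)]+\)")  # needs at least one parameter character
NEW_FUNC_REGEX = re.compile(r"function\([^)]*\)")  # also accepts "function()"


def find_function_span(pattern, js_fragment):
    """Return the (start, end) span of the first function header, or None."""
    match = pattern.search(js_fragment)
    return match.span() if match else None


# Made-up fragments standing in for entries of the throttling array.
with_params = "function(d,e){d.push(e)}"
without_params = "function(){return 1}"

for fragment in (with_params, without_params):
    print(fragment, "| old:", find_function_span(OLD_FUNC_REGEX, fragment),
          "| new:", find_function_span(NEW_FUNC_REGEX, fragment))
```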

@boardkeystown

HUGE you guys rock. Can't wait for the patch

@ryandouglaskish

What text editor is this @JavDomGom

@tylerweston

top screenshot at least looks like PyCharm

glubsy added a commit to glubsy/livestream_saver that referenced this issue Jan 20, 2022
@JavDomGom

JavDomGom commented Jan 20, 2022

Hi @arturtamborski, this is already solved in the current version:
https://github.com/JavDomGom/pytube/blob/master/pytube/parser.py#L152

@JavDomGom

Yes, there are many PRs to review and merge. As soon as the original code is up to date, we should all update as well. In the meantime, anyone who wants to can use my fork, where these problems are already solved, or apply the changes to a local copy.

snapshot_issue_1199_3

@JavDomGom

JavDomGom commented Jan 20, 2022

What text editor is this @JavDomGom

PyCharm 2021.3.1 (Professional Edition), IMHO it is the best Python IDE, I use it like a sir. ;D

@ror1212

ror1212 commented Jan 20, 2022

Please release an update of the library with the changes you have already made, and solve the remaining problems later. We need a solution: all Telegram bots and Windows programs are stopped. Please update the library with what you have solved.

@rmerzouki

Sure it works perfectly, I tested it locally, but what if the app is deployed on Streamlit Cloud/sharing, Heroku ... ?

@JavierDominguezGomez

Sure it works perfectly, I tested it locally, but what if the app is deployed on Streamlit Cloud/sharing, Heroku ... ?

#1199 (comment)

@rishabh3354

Can anyone please share the branch in which this is patched?

@enragedsweat

Someone posted this image. Which file is that, and where can I access it?
image

@mishailovic

Some guy showed this image. Which file is that, and where can i access it image

var_regex = re.compile(r"^\w+\W")

@abanand132

Hi,
this issue hasn't been solved yet. I installed this module in PyCharm and the line "var_regex = re.compile(r"^\$*\w+\W")" is not in the cipher.py file. I have to make the change manually.
In that case, if I send my code to others, it will throw an error on their systems. Can you suggest a way to handle this?
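
One possible stopgap for the "it breaks on other people's machines" problem is to ship a small script that applies the same one-line change to whatever pytube is installed. The sketch below is not an official fix. The function names `patch_cipher_source` and `patch_installed_pytube` are hypothetical, and it assumes the installed cipher.py contains the old line exactly as quoted in this thread. Editing files in site-packages is fragile, so installing from a fork or a fixed release is usually the cleaner route.

```python
import importlib.util
from pathlib import Path

# Exact source lines quoted in this thread (assumed to match the installed file).
OLD_LINE = r'var_regex = re.compile(r"^\w+\W")'
NEW_LINE = r'var_regex = re.compile(r"^\$*\w+\W")'


def patch_cipher_source(source: str) -> str:
    """Return `source` with the old pattern replaced; unchanged if it is absent."""
    return source.replace(OLD_LINE, NEW_LINE)


def patch_installed_pytube() -> bool:
    """Rewrite the installed pytube/cipher.py in place. Return True if it changed."""
    spec = importlib.util.find_spec("pytube")
    if spec is None or not spec.submodule_search_locations:
        return False  # pytube is not installed in this environment
    cipher_path = Path(list(spec.submodule_search_locations)[0]) / "cipher.py"
    if not cipher_path.is_file():
        return False
    original = cipher_path.read_text(encoding="utf-8")
    patched = patch_cipher_source(original)
    if patched == original:
        return False  # already patched, or the line looks different
    cipher_path.write_text(patched, encoding="utf-8")
    return True
```

Calling `patch_installed_pytube()` once, before pytube is imported, would apply the change; running it again is harmless because the old line no longer matches.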

@brilliant-ember

brilliant-ember commented Jun 5, 2022

I wrote the code to fix this bug in PR #1336. I tested it locally and it worked for me.
Until this PR gets merged, you can manually edit line 30 of the cipher file and change the regular expression as described in this Stack Overflow answer: https://stackoverflow.com/a/70777385

@NathanDai5287

After I changed this line, the error changed to AttributeError: 'NoneType' object has no attribute 'span'

sean-schaefer added a commit to Fireline-Science/pytube that referenced this issue Jun 7, 2022
@ChrisZhangJin

May I ask when there will be a formal release or update containing this fix?

@jaybhavsaar

Hello Programmers,
I am facing this issue in the Django code below. Can you help me find my mistake?

Code
from django.shortcuts import render
from pytube import YouTube

def index(request):
    link = str(request.GET)
    print(link)
    youtube_1 = YouTube(link)
    videos = youtube_1.streams.filter(progressive=True, file_extension='mp4', res="720p")
    videos.download()
    print("Downloading...")
    print("Successfully")
    return render(request, "index.html")

Error
In HTML Page
regex_search: could not find match for (?:v=|/)([0-9A-Za-z_-]{11}).*

In Terminal
pytube.exceptions.RegexMatchError: regex_search: could not find match for (?:v=|/)([0-9A-Za-z_-]{11}).*
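
This error looks different from the cipher one at the top of the thread: the pattern in the message is the one that pulls the video ID out of the URL. One plausible cause, not confirmed in this thread, is that `str(request.GET)` is the text form of the whole query dictionary rather than a URL, so there is nothing to match when the page is opened without a video link. The sketch below uses only `re` with the pattern copied from the error message. The helper `extract_video_id` and the sample strings are assumptions for illustration.

```python
import re

# Pattern copied from the error message above.
VIDEO_ID_PATTERN = re.compile(r"(?:v=|/)([0-9A-Za-z_-]{11}).*")


def extract_video_id(text):
    """Return the 11-character video ID found in `text`, or None."""
    match = VIDEO_ID_PATTERN.search(text)
    return match.group(1) if match else None


# Roughly what str(request.GET) looks like when no query parameters are sent.
empty_querydict_repr = "<QueryDict: {}>"
real_url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

print(extract_video_id(empty_querydict_repr))  # no "v=" or "/" in the text, so no match
print(extract_video_id(real_url))
```

If that is the cause, reading a single parameter with something like `request.GET.get("link")` (the parameter name is a guess) and skipping the download when it is missing should avoid the error. Separately, `filter(...)` returns a query rather than a single stream, so a call such as `.first()` is probably needed before `.download()`.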

joejztang pushed a commit to joejztang/pytube that referenced this issue Aug 11, 2023
Signed-off-by: joel tang <jztangw@gmail.com>
isaku-dev added a commit to isaku-dev/pytube that referenced this issue Aug 15, 2023
@igorsantos314

This solution works for me.

@medhsv

medhsv commented May 7, 2024

I changed line 30 manually, but it is still not fixed.

pytube version 15

Yesterday everything was working fine, but today this issue suddenly popped up.

Please provide another fix.

@medhsv

medhsv commented May 7, 2024

I use Google Colab. I made the change in the file and re-ran the cell, but it is still not fixed.

If I am doing anything wrong, please correct me.
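
One thing worth ruling out in a notebook, though nothing in this thread confirms it is the cause here: Python keeps an imported module in memory, so editing the file on disk and re-running the import cell does not pick up the change. The sketch below shows this with a throwaway module written to a temporary directory. The module name `demo_cipher_1199` and the function `demo_stale_import` are invented for the demonstration and have nothing to do with pytube itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path

OLD_SOURCE = 'PATTERN = r"^\\w+\\W"\n'
NEW_SOURCE = 'PATTERN = r"^\\$*\\w+\\W"\n'


def demo_stale_import():
    """Return the pattern seen before the edit, after re-import, and after reload."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "demo_cipher_1199.py"
        module_path.write_text(OLD_SOURCE)
        sys.path.insert(0, tmp)
        try:
            module = importlib.import_module("demo_cipher_1199")
            before = module.PATTERN

            # Edit the file on disk, as one would edit cipher.py by hand.
            module_path.write_text(NEW_SOURCE)

            # Importing again just returns the cached module from sys.modules.
            after_reimport = importlib.import_module("demo_cipher_1199").PATTERN

            # reload() re-executes the source file in the existing module object.
            importlib.invalidate_caches()
            after_reload = importlib.reload(module).PATTERN
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_cipher_1199", None)
    return before, after_reimport, after_reload


before, after_reimport, after_reload = demo_stale_import()
print(before, after_reimport, after_reload)
```

In practice, restarting the runtime after editing the file is simpler than `importlib.reload`, because other modules that already imported names from the edited module keep their old references.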

@igorsantos314

Since the official fix has not been released, I made the correction on the line where the error occurs, which is what you did too. I hope the official fix comes out as soon as possible.

I need to build a Docker image that downloads up-to-date dependencies. Without the official fix, docker build is a problem because the regex error persists in the current version 15.0. To get around this I will have to include the venv from my machine, which is not good practice.

@Scylla2020

Scylla2020 commented May 8, 2024

Unfortunately the fork now throws an HTTP Error 400: Bad Request. I've installed the latest pytube and edited cipher.py to get everything working. Anyone interested can install the latest working version via pip install git+https://github.com/Scylla2020/pytube.git

itsskanha added a commit to itsskanha/pytube that referenced this issue May 8, 2024
There was a bug in this file and it was encountered by many.  As seen on pytube#1199 , the issue has to be resolved by updating the line 30 to var_regex = re.compile(r"^\$*\w+\W")