Fix is_online() for faster offline imports #9544
CLA Assistant Lite bot: All Contributors have signed the CLA. ✅
I have read the CLA Document and I sign the CLA
👋 Hello @khoalu, thank you for submitting an Ultralytics YOLOv8 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:
- ✅ Verify your PR is up-to-date with ultralytics/ultralytics main branch. If your PR is behind you can update your code by clicking the 'Update branch' button or by running git pull and git merge main locally.
- ✅ Verify all YOLOv8 Continuous Integration (CI) checks are passing.
- ✅ Update YOLOv8 Docs for any new or updated features.
- ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." - Bruce Lee
See our Contributing Guide for details and let us know if you have any questions!
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9544      +/-   ##
==========================================
+ Coverage   76.12%   76.14%   +0.01%
==========================================
  Files         121      121
  Lines       15281    15278       -3
==========================================
  Hits        11633    11633
+ Misses       3648     3645       -3
Flags with carried forward coverage won't be shown.
@khoalu thanks for the PR, but I did a serious amount of optimization before introducing this function into the codebase. The execution time in an offline environment (in this case my M2 MacBook Air with Wi-Fi disabled) is 43 microseconds, i.e. about 4% of a millisecond.
@khoalu hey I had one idea though. You're right that the initial import is too slow, but this is mostly because we import so many packages. I've done some work eliminating default package imports, so that, for example, pandas and seaborn are now completely eliminated from the initial imports. Maybe you could check the rest of the package import speeds and see if there's an opportunity to scope the slowest imports across the repo to improve the ultralytics import speed. What do you think?
I use ultralytics behind a corporate proxy. I don't really know the details of the proxy being used. When I set ONLINE = False without running through the Using my PR, I benchmark like this:
Some other packages for reference:
If my PR won't be merged, I wonder if there is some way to monkey patch that line. UPDATE:
The result:
Packages for reference:
That's a great idea. I will check the packages and, if possible, open another PR for this.
@khoalu oh really strange. Ok let me think about this a bit. I also profiled all the imported packages, torchvision in particular seems very slow, maybe I can scope it around the repo:
PIL: 0.0003 seconds
yaml: 0.0082 seconds
tqdm: 0.0093 seconds
psutil: 0.0099 seconds
cpuinfo: 0.0137 seconds
requests: 0.0414 seconds
scipy: 0.0693 seconds
matplotlib: 0.1319 seconds
cv2: 0.1403 seconds
pandas: 0.2956 seconds # already scoped
torch: 0.6030 seconds
thop: 0.6066 seconds # imports torch
seaborn: 0.7776 seconds # already scoped
torchvision: 0.9394 seconds # <-- slowest by far
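For anyone wanting to collect a table like the one above, here is a minimal sketch of one way to time imports. It is not the script used in this thread (that script is not shown here); `time_import` and `profile_imports` are hypothetical helper names, and standard-library modules stand in for the real packages so the sketch runs anywhere.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to import module_name in this process."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


def profile_imports(module_names):
    """Time each import and return (name, seconds) pairs, fastest first."""
    results = []
    for name in module_names:
        try:
            results.append((name, time_import(name)))
        except ImportError:
            # Package is not installed in this environment; skip it.
            continue
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    # Stand-in modules (an assumption for portability); replace with the
    # packages of interest, e.g. "torch" or "torchvision", if installed.
    for name, seconds in profile_imports(["json", "decimal", "sqlite3"]):
        print(f"{name}: {seconds:.4f} seconds")
```

Note that a module already imported in the same process returns almost instantly, so a package that pulls in another (as thop does with torch in the list above) only shows its full cost when measured in a fresh interpreter. Python's built-in `python -X importtime -c "import <package>"` gives a per-module breakdown without any custom code.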
My results, to match yours: I'm using Python 3.10.13, torch==2.1.0+cpu, torchvision==0.16.0+cpu
script:
I get this with your script (M2 MacBook Air)
@khoalu that's really strange. Are there modifications you can make directly to the is_online() function to allow it to run faster in your environment?
I'm working to see if I can scope torchvision imports
I'll try and reply back later
good topic
@khoalu does this function work faster, switching the port from 53 to 80, which is more standard? EDIT: I've updated the PR with this change, which should improve performance behind firewalls.

    def is_online() -> bool:
        """
        Check internet connectivity by attempting to connect to a known online host.

        Returns:
            (bool): True if connection is successful, False otherwise.
        """
        import socket

        for host in ("1.1.1.1", "8.8.8.8"):  # Cloudflare, Google
            try:
                test_connection = socket.create_connection(address=(host, 80), timeout=1.5)
            except (socket.timeout, socket.gaierror, OSError):
                continue
            else:
                # If the connection was successful, close it to avoid a ResourceWarning
                test_connection.close()
                return True
        return False

    %timeit is_online()
Still doesn't really work in my case :( Using my script above:
It decreased 3 seconds (= 1.5 s timeout x 2 hosts, I guess?)
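The "timeout x hosts" arithmetic above can be checked without a real firewall. The sketch below is an illustration under stated assumptions, not code from the PR: `is_online_sequential` mirrors the host loop from the function above but accepts an injectable `connect` callable, and `blocked_connect` is a hypothetical stand-in for a network that silently drops packets until the timeout expires.

```python
import socket
import time


def is_online_sequential(hosts, timeout, connect=socket.create_connection):
    """Try each host in turn; return True on the first successful connection."""
    for host in hosts:
        try:
            connect((host, 80), timeout=timeout).close()
        except OSError:  # socket.timeout and socket.gaierror are OSError subclasses
            continue
        return True
    return False


def blocked_connect(address, timeout):
    """Hypothetical firewall stand-in: wait out the full timeout, then fail."""
    time.sleep(timeout)
    raise socket.timeout("timed out")


# A short timeout keeps the demo fast; the thread used 1.5 s per host.
start = time.perf_counter()
online = is_online_sequential(("1.1.1.1", "8.8.8.8"), timeout=0.05, connect=blocked_connect)
elapsed = time.perf_counter() - start
print(online, f"{elapsed:.2f}s")  # elapsed is at least 2 x 0.05 s
```

When every host is blocked, the loop pays the full timeout once per host, so the worst case is roughly the number of hosts multiplied by the timeout. With two hosts at 1.5 s each, that is the 3 seconds mentioned above.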
@khoalu oh got it. I think an HTTP request may provide a more standardized format better suited for firewall environments. Can you test this function for speed in your offline environment?

    import requests


    def is_online() -> bool:
        """
        Check internet connectivity by making an HTTP GET request to a known online host.

        Returns:
            bool: True if the HTTP request is successful, False otherwise.
        """
        try:
            # Use httpbin.org for a simple status check. HTTPS is used for compatibility with strict firewalls.
            response = requests.get("https://httpbin.org/get", timeout=1.5)
            # Check if the HTTP status code is 200 (OK), indicating successful internet connectivity.
            if response.status_code == 200:
                return True
        except requests.RequestException:
            return False
        return False
The above is_online() takes 9 seconds in my environment.
The time it takes seems to scale linearly. I tried changing the URL to https://example.com, keeping timeout=1.5, and it now takes 1.5 seconds.
@khoalu ok got it. I give up then, I'll add the proposed |
@khoalu ok buddy, new function is faster (and shorter), and also incorporates your YOLO_OFFLINE ENV variable. Can you check if this works for you?

    import contextlib
    import os
    import socket


    def is_online() -> bool:
        """
        Check internet connectivity by attempting to connect to a known online host.

        Returns:
            (bool): True if connection is successful, False otherwise.
        """
        with contextlib.suppress(Exception):
            assert str(os.getenv("YOLO_OFFLINE", "")).lower() != "true"  # check if ENV var YOLO_OFFLINE="True"
            socket.create_connection(address=("1.1.1.1", 80), timeout=1.0).close()  # check Cloudflare DNS
            return True
        return False
@khoalu PR merged! Thank you for your contributions. Offline import should now be much faster (no slower than 1s), and you can now use Let me know if this works for you and if you spot any other areas for improvement! |
Sorry for the late response. Thank you very much, it works for me now. Now importing ultralytics only takes 1.5 seconds with |
@khoalu I'm glad to hear that the update works well for you! If you have any more questions or run into any issues, feel free to reach out. Happy coding!
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com> Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
If the user knows that this package is used in an offline environment, don't waste time checking for online status on import: set the environment variable "YOLO_OFFLINE=true".
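As a rough illustration of the override described above, the sketch below shows an environment check that returns before any socket is opened. It is a simplified variation written for this example (an explicit `if` instead of the `assert` inside `contextlib.suppress` used in the thread), and the `timeout` parameter is an assumption, not part of the merged function.

```python
import os
import socket


def is_online(timeout: float = 1.0) -> bool:
    """Return False immediately if YOLO_OFFLINE is "true", else probe the network."""
    if os.getenv("YOLO_OFFLINE", "").lower() == "true":
        return False  # user opted out, so no socket is opened
    try:
        socket.create_connection(("1.1.1.1", 80), timeout=timeout).close()
        return True
    except OSError:
        return False


os.environ["YOLO_OFFLINE"] = "true"  # must be set before the check runs
print(is_online())  # prints False without waiting for a timeout
```

Because the check happens at import time according to the description above, the variable has to be set before the package is imported, for example in the shell with `YOLO_OFFLINE=true python your_script.py` (the script name is a placeholder).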
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Improved Internet connectivity check method with environmental override.
📊 Key Changes
YOLO_OFFLINE
🎯 Purpose & Impact