Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed Python Notebook Examples #226

Merged
merged 19 commits into from May 1, 2024
Merged

Conversation

MarkFischinger
Copy link
Contributor

I'm excited to share some updates I've made to the Python notebook examples. While browsing through the existing examples because of #224, I noticed a few issues that had to be fixed.
Along with correcting typos, updating for the latest mlpack API changes as well as some other bug fixes, I've ensured all python notebooks are now fully functional by running and fixing each one :)

Once I have some more time, I plan to extend these fixes to the notebooks in other languages too.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

Binder 👈 Launch a binder notebook on branch MarkFischinger/examples/fix/examples

Copy link

review-notebook-app bot commented Apr 11, 2024

View / edit / reply to this conversation on ReviewNB

rcurtin commented on 2024-04-11T13:36:09Z
----------------------------------------------------------------

It seems like meanDates isn't used anywhere else (correct me if I'm wrong), maybe we can just remove this cell?


@@ -1,175 +1,552 @@
{
Copy link
Member

@rcurtin rcurtin Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you had any issues, I can put the CSV also on datasets.mlpack.org. Looks like maybe there was an SSL failure? But everything seems to load correctly for me.


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it up on https://datasets.mlpack.org/breast-cancer-wisconsin.csv, which shouldn't have an SSL issue, want to update the link to there?

@@ -0,0 +1,506 @@
{
Copy link
Member

@rcurtin rcurtin Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to add this notebook? It looks new, I guess an adaptation of the C++ version?


Reply via ReviewNB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcurtin yes, I thought it would be a great addition since most of the notebook examples have a C++ and Python version.

Copy link

review-notebook-app bot commented Apr 11, 2024

View / edit / reply to this conversation on ReviewNB

rcurtin commented on 2024-04-11T13:36:13Z
----------------------------------------------------------------

I just saw Whiplash a few weeks ago (number 24 on the list), I didn't realize it was so old. I thought it was a great movie.


MarkFischinger commented on 2024-04-11T15:03:55Z
----------------------------------------------------------------

Haven't seen it yet, but I’ll definitely check it out on your recommendation when I have the chance ;)

Copy link

review-notebook-app bot commented Apr 11, 2024

View / edit / reply to this conversation on ReviewNB

rcurtin commented on 2024-04-11T13:36:14Z
----------------------------------------------------------------

Nice catch!


Copy link

review-notebook-app bot commented Apr 11, 2024

View / edit / reply to this conversation on ReviewNB

rcurtin commented on 2024-04-11T13:36:16Z
----------------------------------------------------------------

I wonder if it would be possible to use mlpack.preprocess_split() and mlpack.preprocess_scale() here instead, but, don't feel obligated to look into it, it's just a passing thought.


Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is absolutely great! Thank you for taking the time to go through all of these @MarkFischinger. I wish that we had an automated way to keep things up to date, but I think we don't have any CI job that tests the notebooks. All of my comments are pretty minor, really the main one is about the SSL verification---I think it shouldn't be needed, was there a certificate issue?

@MarkFischinger
Copy link
Contributor Author

@rcurtin Yes, I encountered a slight issue with SSL verification while requesting lab.mlpack.org. The error I ran into was: URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)>.

My current approach is a workaround to this problem. However, I agree that the ideal solution would be to update the SSL certificate on the website.

Copy link
Contributor Author

Haven't seen it yet, but I’ll definitely check it out on your recommendation when I have the chance ;)


View entire conversation on ReviewNB

@@ -1,225 +1,666 @@
{
Copy link
Member

@rcurtin rcurtin Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #2.    !wget --no-check-certificate -O - https://lab.mlpack.org/data/cifar10-images.tar.gz | tar -xz

This one can also be changed to https://datasets.mlpack.org/cifar10-images.tar.gz (actually it was already located there too).


Reply via ReviewNB

@@ -1,146 +1,171 @@
{
Copy link
Member

@rcurtin rcurtin Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #3.    df = pd.read_csv('https://lab.mlpack.org/data/covertype-small.csv.gz')

Here you can use https://datasets.mlpack.org/covertype-small.csv.gz


Reply via ReviewNB

Copy link

review-notebook-app bot commented Apr 22, 2024

View / edit / reply to this conversation on ReviewNB

rcurtin commented on 2024-04-22T20:40:44Z
----------------------------------------------------------------

This is also on datasets.mlpack.org: https://datasets.mlpack.org/MovieLens-small.zip


@rcurtin
Copy link
Member

rcurtin commented Apr 22, 2024

I went through and added the datasets to datasets.mlpack.org, so now we can use that instead to avoid the SSL error. Can you make those changes? If so I think this is good to go. 👍

@MarkFischinger
Copy link
Contributor Author

@rcurtin I removed all SSL certificate exceptions, tested it, and everything seems to be working fine now :)

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks again! 👍

Copy link

@mlpack-bot mlpack-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval provided automatically after 24 hours. 👍

@rcurtin rcurtin merged commit 5b2062c into mlpack:master May 1, 2024
4 checks passed
@rcurtin
Copy link
Member

rcurtin commented May 1, 2024

Thanks again @MarkFischinger! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants