Downloading the dataset and starting in the right directory #528

Open
fa2k opened this issue Mar 2, 2021 · 5 comments
Labels
good first issue Good issue for first-time contributors help wanted Looking for Contributors

Comments

@fa2k

fa2k commented Mar 2, 2021

We did the first half of this lesson as an online workshop today. The biggest source of problems was starting Jupyter Lab from the terminal / command prompt and then accessing the files.

We had added a section on downloading and unzipping the dataset to the installation instructions on the course website (https://uio-carpentry.github.io/2021-03-02-uio-python-online/). I tried to instruct people to go to the desktop during the Running and Quitting episode.

Problems:

  • For some users, the Anaconda Powershell prompt started in C:\>.
  • On centrally managed PCs, the desktop was located on a network drive. Some people did find the data at that path.
  • Others had OneDrive on the desktop, and weren't able to find the data there at all.

Especially in an online workshop, it is hard to deal with these issues. The Setup instructions for this lesson say to unpack in the "root directory", but I think they actually mean the "home directory". Either way, it's not easy for a novice to know what the root or home directory is. Furthermore, if the console starts in C:, they get a permission error when trying to make a new notebook.

It would be great if we could find a bullet-proof procedure to navigate to some directory. This is quite essential when trying to do something useful in Jupyter Lab.
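
As a rough illustration of the permission problem described above, a first notebook cell (or a plain Python session) could check whether the current directory is writable before learners try to create a notebook there. This is only a sketch, not part of the lesson: the helper name `can_write_here` is hypothetical, and it probes by creating a temporary file rather than relying on reported permissions.

```python
import tempfile
from pathlib import Path


def can_write_here(directory="."):
    """Return True if a temporary file can be created in `directory`.

    Hypothetical helper: it probes by actually creating (and immediately
    discarding) a temporary file, and treats any OSError as "not writable".
    """
    try:
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers permission errors as well as missing directories.
        return False


print("Current directory:", Path.cwd())
print("Writable:", can_write_here())
```

If this prints `Writable: False`, the session was probably started somewhere like the drive root, and the learner needs to change directory before creating a notebook.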

@alee alee added good first issue Good issue for first-time contributors help wanted Looking for Contributors labels Mar 2, 2021
@alee
Member

alee commented Mar 2, 2021

Thanks for noting this, I am sure others have experienced the same in their own institutional environments - would you be willing to submit a PR with improved instructions?

@fa2k
Author

fa2k commented Mar 3, 2021

After some discussion, we decided to try to make a longer set of setup instructions that includes opening the notebook and checking that the data folder is there. This will be on the course website. The plan is to hold installation help sessions before the workshop, where people can come if the procedure doesn't work for them easily. I think this set of instructions will go in the workshop template repo, so it may not be a PR to this one. But if I'm able to finish that, I'll try to make some changes here too.

@alee
Member

alee commented Aug 16, 2021

The magic command %cd and os.chdir can do the trick. #559 attempts to address this but may require some revision.
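
For readers unfamiliar with the two options mentioned here, a minimal sketch follows. `%cd` is an IPython magic and only works in a notebook or IPython session, so it appears as a comment; the function shows the plain-Python equivalent with `os.chdir`. The folder name `python-workshop` and the helper name `change_directory` are assumptions for illustration, not taken from the lesson or from #559.

```python
import os
from pathlib import Path

# In a notebook cell, the IPython magic form would be (hypothetical folder):
#   %cd ~/python-workshop
# The function below is a plain-Python equivalent built on os.chdir.


def change_directory(path):
    """Expand '~' to the home directory, change into `path`, return the new cwd."""
    target = Path(path).expanduser()
    os.chdir(target)  # raises FileNotFoundError if the folder does not exist
    return Path.cwd()
```

Calling `change_directory("~/python-workshop")` in the first cell would make later relative paths such as `data/gapminder_gdp_oceania.csv` resolve against that folder, provided the data was unpacked there.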

@gracieflores

We did a shortened version of the lesson today and provided time before we started for anyone who needed help with setup. We asked that everyone have the dataset downloaded, but I think it would have been helpful to have instructions like the ones @fa2k included on their workshop page. We ended up having to pause in the middle of 'Reading Tabular Data into DataFrames' because students were not able to access their data. It probably took 20 minutes to get everyone settled and back on track. If we include more specific details on how to download the data and where to place it, that episode might go more smoothly. Another thing we could add during 'Running and Quitting' is a short section to check that the data folder is available. We used tab completion to help many students find their data folder, so once we introduce Jupyter Notebook we could have the students check in a cell that they can access that folder. I think most of our issues were with OneDrive.
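
One possible shape for such a check cell is sketched below. It assumes the data was unpacked into a folder named `data` next to the notebook and that the files match `gapminder_*.csv`; the helper name `find_data_files` is hypothetical.

```python
from pathlib import Path


def find_data_files(data_dir="data", pattern="gapminder_*.csv"):
    """Return the sorted names of matching files in `data_dir`.

    Returns an empty list if the folder is missing, so learners see a
    clear signal instead of a traceback.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob(pattern))


files = find_data_files()
if files:
    print("Found data files:", files)
else:
    print("No data files found under", Path.cwd() / "data")
```

Printing the full path in the failure case gives helpers something concrete to compare against wherever the learner actually unzipped the dataset.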

@kaitlinnewson

We ran into a similar issue when loading the data (specifically with OneDrive). One of our helpers proposed this fix for the user:

import pandas as pd

# Store the current working directory (%pwd is an IPython magic, so this runs in a notebook cell).
data_folder = %pwd
# Append the data sub-folder.
data_folder = data_folder + '/data'
# Append the name of the CSV file to read and load it.
data_oceania = pd.read_csv(data_folder + '/gapminder_gdp_oceania.csv')

Happy to make a PR if this would be useful to include somewhere.
