cardsort: analyzing data from open card sorting tasks #102
Comments
Hey @katoss, I am Alex, and I will be acting as the editor-in-chief for this submission. I am looking forward to working with you 👋 |
Editor in Chief checks
Hi there! Thank you for submitting your package for pyOpenSci. Please check our Python packaging guide for more information on the elements
Editor comments
You could increase the pool of potential users by loosening the constraints. Based on what I saw in the sources, we do not need the latest versions for these four packages.
|
Thank you @Batalex ! I will try to address your comments in the course of the week. :) |
Hi @Batalex,
Normally semantic release should bump the version automatically, but I keep running into issues. It only seems to work via getting the release version from the GitHub tag. I set the version to 0.0.0 for now, as you suggested. I will try to find a better solution, but for now at least the version is correctly updated on github and pypi.
I am sorry if this is a stupid question, but how do I know which are the minimum versions of packages I can allow without breaking my code? I tried using
True, thank you. Fixed it.
I wasn't aware that this was possible. Thank you, fixed it! Best, |
This is an excellent question! And a truly difficult one to answer in Python, given the dynamic nature of the language. Maybe someday, a tool will be able to parse the source code and match it against the dependencies' changelogs, but I am dreaming 🫠 If the dependencies were perfectly following the semver convention, you could use the latest major releases for each dependency. This is extreme in a practical setting, so I would not advise going so far. You are using only a few basic numpy functions, so you should be fine with the same constraint as pandas.
Maybe we could go even lower with pandas, but this change is enough IMO. |
Thank you for the explanations! So it is really not that easy. For now, I reduced the constraints to the versions you indicated. |
Hi @Batalex @katoss just adding one more possibly useful piece of info: Unfortunately it does not say what is the minimum that you should support 🤷 but I basically go with the minimum for whatever Python is still supported -- right now that's Python 3.9 -> numpy 1.21.0, etc. ... It looks like you ended up with roughly those versions as your lower bounds anyways. Not to overwhelm you with info but if you don't know about it already you might want to check out: https://iscinumpy.dev/post/bound-version-constraints/ |
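To make the lower-bound idea concrete, here is a minimal sketch of checking a version string against a minimum. The helper names and the `MINIMUMS` table are hypothetical; only the numpy 1.21.0 bound comes from the comment above. It handles plain `X.Y.Z` strings only, so for real projects the `packaging.version.Version` class is the safer choice.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


# Hypothetical table of lower bounds; only the numpy value is taken
# from the discussion above.
MINIMUMS = {"numpy": "1.21.0"}

print(meets_minimum("1.24.3", MINIMUMS["numpy"]))
print(meets_minimum("1.9.5", MINIMUMS["numpy"]))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.9.5" sorts after "1.21.0", which would wrongly pass the check.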
Hey @katoss!, |
👋 Hi @Robaina and @khynder! Thank you for volunteering to review for pyOpenSci! I am thrilled to have you both on board for this review. Feel free to introduce yourselves to our very own submitter @katoss 🤗 @katoss, meet @Robaina, whom I had the pleasure of working with on his submission to pyOpenSci. The following resources will help you complete your review:
Please get in touch with any questions or concerns! Your review is due: on June 5th. |
Hi @katoss, nice to meet you! I'll be one of the reviewers of |
a quick question: according to pyOpenSci docs I understand that I should review |
Hey,
The rationale behind this approach is that we want to ensure the review scope stays more or less the same. No one wants to commit to a review on a fixed scope only to discover that the goalposts moved a few weeks later. Based on what I saw, the most recent releases deal with a few mishaps on the packaging side. Most of the code base remains untouched so this approach should be fine! |
Alright! Thanks for the explanation |
⭐ I have completed the review ⭐ I find Ok, here it goes: CHECKLIST:
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
It is succinct but sufficient. However, I think the repo would benefit from a little more explanation for non-experts. This may be as easy as providing a link to an url explaining why card sorting is useful in UX design.
The instructions are located within the 'CONTRIBUTING.md' file. I would add the link to this file directly in the Contributing section of the README.md file.
The documentation is extensive and hosted on readthedocs.io. I suggest providing the URL to the docs in the About section of the repo to increase visibility.
I suggest adding the link to the contributing file in the contributing section of the README.md file.
Readme file requirements
The README should include, from top to bottom:
The current name of the package, 'cardsort', is missing as such in the title of the README file. Also, if you like, you could create a simple logo for the package to attract more users (See this repo as an example).
There are no badges in the README. You can read about badges here in case you are not familiar. Also, I can help with this if you need it.
As stated above, I suggest adding the URL to the About section of the repo to increase visibility.
No citation info was provided. I suggest publishing the repo in Zenodo and adding the citation and the DOI to the README.md file. Once published, Zenodo provides a badge containing the DOI, which you can display in the badge section of the README file. You could also add a CITATION.cff file to the repo so users can easily get the citation string.
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
While the purpose of this package is indicated in the README, I think a little more explanation for non-experts would be useful. This may be as easy as providing a link to an url explaining why card sorting is useful in UX design.
Functionality
As far as I can tell, the package works as expected.
For packages also submitting to JOSS
Not applicable.
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a
Final approval (post-review)
Estimated hours spent reviewing: 3.5
Review Comments
General comments:
Detailed comments for each function:
_get_distance_matrix_for_user
    for i in range(n):
        for j in range(i, n):
            cat1 = ...
            cat2 = ...
            X[i, j] = X[j, i] = 0 if cat1 == cat2 else 1
Note that you could also write the if - else statement in a single line as shown above.
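For reference, the symmetric fill suggested above can be written out as a runnable sketch. The function name, the dictionary input, and the example cards are hypothetical, and plain nested lists stand in for the NumPy array so the snippet has no dependencies.

```python
def distance_matrix_for_user(card_to_category):
    """Build a symmetric 0/1 distance matrix for one participant.

    `card_to_category` maps each card label to the category that the
    participant placed it in (hypothetical input format).
    """
    cards = list(card_to_category)
    n = len(cards)
    X = [[0] * n for _ in range(n)]
    for i in range(n):
        # Only visit the upper triangle (j >= i) and mirror each value.
        for j in range(i, n):
            cat1 = card_to_category[cards[i]]
            cat2 = card_to_category[cards[j]]
            X[i][j] = X[j][i] = 0 if cat1 == cat2 else 1
    return cards, X


example = {"apple": "fruit", "banana": "fruit", "carrot": "vegetable"}
cards, X = distance_matrix_for_user(example)
for card, row in zip(cards, X):
    print(card, row)
```

Looping over `range(i, n)` roughly halves the number of comparisons, and the chained assignment keeps the matrix symmetric by construction.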
get_distance_matrix
create_dendrogram
    color_treshold = 0.75 if color_treshold is None else color_treshold
and similar to the second definition to simplify the nested if - else statements.
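A small standalone sketch of this default-value idiom follows. The function and constant names are hypothetical and not taken from the package; only the 0.75 default comes from the comment above.

```python
DEFAULT_COLOR_THRESHOLD = 0.75  # default value mentioned in the review


def resolve_color_threshold(color_threshold=None):
    """Return the caller's value, or the default when none was given."""
    return DEFAULT_COLOR_THRESHOLD if color_threshold is None else color_threshold


print(resolve_color_threshold())     # no value given, falls back to the default
print(resolve_color_threshold(0.3))  # explicit value is kept
print(resolve_color_threshold(0))    # an explicit 0 is also kept
```

Testing `is None` rather than writing `color_threshold or 0.75` is deliberate: the shorter form would silently replace an explicit `0` with the default.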
    plt.xticks(np.arange(0.0, 1.1, 0.1) if x_max <= 1 else np.arange(0, x_max + 1, 1))
_get_cluster_label_for_user
    if card in df_u["card_label"].values:
        ...
    else:
        print(f'"{card}" is not a valid card label. Removed from list.')
        cluster_cards.remove(card)
        print("Continue with cards: %s" % cluster_cards)
_get_cards_for_label
get_cluster_labels
    print("User " + str(id) + " labeled card(s): " + cluster_label)
with:
    print(f"User {id} labeled card(s): {cluster_label}")
get_cluster_labels_df
|
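One caveat on the card-validation snippet in the review above: the loop around it is not shown, but if `cluster_cards.remove(card)` runs while iterating over `cluster_cards` itself, Python can skip elements. A hedged alternative is to build a filtered list instead. The function name and arguments below are hypothetical and not part of the package.

```python
def keep_valid_cards(cluster_cards, valid_labels):
    """Return the cards that are valid labels, reporting the rest.

    The input list is left untouched, so it is safe to iterate over it.
    """
    valid = set(valid_labels)
    kept = []
    for card in cluster_cards:
        if card in valid:
            kept.append(card)
        else:
            print(f'"{card}" is not a valid card label. Removed from list.')
    if len(kept) != len(cluster_cards):
        print(f"Continue with cards: {kept}")
    return kept


requested = ["Dog", "Unicorn", "Cat"]
kept = keep_valid_cards(requested, ["Dog", "Cat", "Horse"])
```

The f-strings follow the same formatting style suggested for `get_cluster_labels`.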
Thanks a lot for your review @Robaina ! I had a quick look over your comments, and will try to address them over the weekend :) |
That was fast 🚀 |
Hi @Robaina, thanks a lot again for the thorough review! 🌟 I have started to address your comments, mainly so far by adapting the functions:
Thank you for helping me improve the code quality! Please let me know if I managed to adapt the functions as you imagined. Regarding the rest of the feedback, I will try to address it as soon as possible. |
I started to wonder about a more general question regarding versioning. My CI/CD pipeline publishes a new version with each commit. For example, yesterday I went from 0.2.12 to 0.2.20 by making a commit for each change (since I did not want to make the commits too big). Based on your experience @Robaina @Batalex, is this the usual way to go about it, or is it too much? |
Great! Glad my comments were useful! I'll take a deeper look at them later and come back to you. |
Based on my experience, I would say this is too much. I would update the version only after a significant change to the codebase. Of course, "significant" is subjective. Here are some guidelines for semantic versioning. On the other hand, I actually forgot to ask you about your CI/CD pipeline the other day. I observed that the package is currently built and uploaded to PyPI on every push to the main branch. I am by no means an expert in DevOps, but I think that if you are pushing every commit directly to main, then building the package and uploading it to PyPI each time is perhaps too much. I would do this only after collecting some changes that address a specific issue (or issues). This workflow would be fine if there were another branch, say, dev (for development), meant to collect changes, with the CD workflow triggered on merging dev into main. @Batalex will definitely know more about this. |
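As a rough illustration of the semantic versioning rules mentioned above, the sketch below bumps a plain `MAJOR.MINOR.PATCH` string. The `bump` helper is hypothetical and is not how semantic release works internally; it only shows which numbers reset when a higher part changes.

```python
def bump(version, part):
    """Return `version` with the given part incremented.

    Under semantic versioning, a minor bump resets the patch number,
    and a major bump resets both minor and patch.
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.2.12", "patch"))  # one bug fix
print(bump("0.2.12", "minor"))  # a batch of backwards-compatible features
```

Seen this way, eight patch releases in one day correspond to eight separately published bug fixes, which is why batching changes on a development branch before releasing is the more common rhythm.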
Ok, thank you! Then my intuition wasn't wrong. I followed this tutorial to create the package and CI/CD, and did not question the way they did it, since it was the first time for me setting all of this up. |
Hi, It is quite a lot, I listed everything, but have addressed only some issues so far, and have some questions for others. @khynder, could you accept the invitation to become a collaborator on cardsort, so I can assign you as a reviewer to the PR? I also feel there might still be an issue with the trigger for CD, since CD started for all commits for patch changes in the new PR (see Actions). |
Can you try removing the |
Thank you, just changed it in the fix-functions branch, let's see if it changes something. Edit: I guess the change was not thought to impact the CD and was more of a global suggestion, but in any case the CD is still getting triggered. Probably the
does not work. I will try to think about another solution |
I merged the I think I worked through all the feedback now 🤔 What are the next steps? :) |
Congrats @katoss, that was a big one!
For now, I'll ask @Robaina and @khynder to check the final approval checkbox in their review message. Then, I will wrap up the review by checking some other stuff. It's great that you already took care of zenodo and the badge! |
Yes, I did, I think I forgot to write it. Now the CD only triggers when changes are pushed or merged with the main branch. So far it seems to work :)
Ok amazing! |
Checked! |
so sorry for the delay... checked! |
🎉 Author Wrap Up Tasks
There are a few things left to do to wrap up this submission:
@Robaina, @khynder we would greatly appreciate it if you could fill out the post-review survey as well! |
Congrats @katoss! |
Congrats @katoss, I am glad this review went so well. I am truly pleased that we were able to add value to your work. We invite authors to join us on our Slack / Discourse. Let me know if you are interested in joining the Slack. |
Amazing, thanks @Batalex :) And yes, I would like to join the slack. Could you send me an invite? I saw that the PR to update the pyOpenSci website with the |
I just invited you 🐈‍⬛
It seems that we have an issue with pre-commit.ci, but it should be fixed soon |
hey there @katoss 👋 thank you for mentioning this issue in our website update pr!! i saw it. it took me a while to merge as there were some bugs in the build that i wanted to fix before merging. it would have broken the website! but generally these will get merged more quickly - as that build becomes a bit more stable. cardsort is now listed on our website!! 🚀 and we look forward to seeing you around slack!! thank you everyone for your work on this pr!! and @Batalex for leading the charge on the review ✨ !!! |
Submitting Author: (@katoss)
All current maintainers: (@katoss)
Package Name: cardsort
One-Line Description of Package: A python package to analyse data from open card sorting tasks
Repository Link: https://github.com/katoss/cardsort
Version submitted: 0.2.2
Editor: @Batalex
Reviewer 1: @Robaina
Reviewer 2: @khynder
Archive:
JOSS DOI: N/A
Version accepted: v 0.2.36
Date accepted (month/day/year): 08/16/2023
Code of Conduct & Commitment to Maintain Package
Description
Card sorting is a standard research method in the human computer interaction field to design information architectures. The cardsort package analyzes and visualizes data from open card sorting tasks. Using csv data in the format returned by the kardsort tool (or any other tool outputting the same columns), it outputs a dendrogram based on hierarchical cluster analysis and pairwise similarity scores. It can also return category labels to learn which labels study participants gave to combinations of cards from emerging clusters.
Scope
Please indicate which category or categories.
Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
Domain Specific & Community Partnerships
Community Partnerships
If your package is associated with an existing community please check below:
For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
The target audience is researchers or practitioners in human computer interaction and user experience. Open card sorting is a popular user research method used to understand how people order and conceptualize information, in order to design information architectures. To make sense of the data, clusters are often visualized in the form of a dendrogram. To do so, pairwise similarity scores need to be calculated for all cards, followed by a hierarchical cluster analysis. This functionality is provided by the cardsort package. It also offers functionality to return category labels, in order to learn which labels study participants gave to combinations of cards in the emerging clusters.
I did not find any python packages, nor other open source tools, that accomplish this, which was my motivation to make this package, when I was running a card sorting study in the course of my PhD. While research articles describe the steps of this analysis (i.e. pairwise similarity scores --> hierarchical cluster analysis --> dendrogram, see articles linked in question above), I did not find any open source tools that help to do this analysis. I used the widely used kardsort.com tool to collect the data, which refers to rather dated, closed source Windows software for analysis, which underpins my assumption of a lack of open source tools. The approach is rather simple, making use of scipy for hierarchical cluster analysis and dendrogram, adding a custom function to create the similarity scores, and putting everything in an easy-to-use pipeline. Nevertheless, in doing so, I think the cardsort package can help remove barriers for the application of this user research method (easy to use even for python beginners, no need to use closed source software that only runs on windows or expensive subscription tools). In any case I would have liked to have this package when I started my study :)
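The first step of the analysis described above (pairwise similarity scores, followed by hierarchical cluster analysis and a dendrogram) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the input format, the function name, and the choice to average over participants are all hypothetical, and the clustering and plotting steps that the author delegates to scipy are left out.

```python
from itertools import combinations


def pairwise_distances(sorts):
    """Fraction of participants who put each pair of cards in different groups.

    `sorts` holds one dict per participant, mapping card label to the
    category that participant chose (hypothetical input format).
    """
    cards = sorted(sorts[0])
    dist = {}
    for a, b in combinations(cards, 2):
        apart = sum(1 for s in sorts if s[a] != s[b])
        # Averaging over participants is an assumption of this sketch.
        dist[(a, b)] = apart / len(sorts)
    return dist


sorts = [
    {"apple": "fruit", "banana": "fruit", "carrot": "veg"},
    {"apple": "snacks", "banana": "snacks", "carrot": "snacks"},
]
dist = pairwise_distances(sorts)
for pair, d in dist.items():
    print(pair, d)
```

A distance of 0 means every participant grouped the two cards together, and 1 means nobody did; a hierarchical clustering routine would then merge the closest pairs first.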
@tag the editor you contacted: #101
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication Options
JOSS Checks
paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
Note: Do not submit your package separately to JOSS
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Confirm each of the following by checking the box.
Please fill out our survey
submission and improve our peer review process. We will also ask our reviewers
and editors to fill this out.
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
The editor template can be found here.
The review template can be found here.
Footnotes
Please fill out a pre-submission inquiry before submitting a data visualization package.