Feedback on the new Anthology website #170

Open
mbollmann opened this issue Mar 10, 2019 · 64 comments

@mbollmann
Collaborator

commented Mar 10, 2019

This thread is intended to collect all feedback, suggestions, bug reports, etc. for the new Anthology website in the static-rewrite branch.

(Edit: live demo here at http://aclweb.org/anthology)

If you do not have a GitHub account, you're also welcome to send me feedback via e-mail (marcel@bollmann.me) or Twitter (@mmbollmann)!

Known Issues

  • The search functionality now uses Google Custom Search. We're still finetuning its settings and waiting for some pages to be indexed, so please don't report any weird search behaviour just yet.
  • Author name variations (#86) are an open problem that we plan to address before the site launch.

@mbollmann self-assigned this Mar 10, 2019

@mbollmann added this to To do in Static Rewrite of the Anthology via automation Mar 10, 2019

@mbollmann pinned this issue Mar 10, 2019

@akoehn

Member

commented Mar 11, 2019

I really like it, especially the speed!

There is a display:none span containing the text "bib" in the bibtex block inside the acl-paper-link-block block. When using a text browser, this leads to the text being BibTeXbib. That span should be removed.

As a minor comment: Could you specify the hardware requirements for building the anthology a bit? How much time & memory does building take? "a considerable amount of memory" could be 8GB or 512, depending on whom you ask :-)

@davidweichiang

Collaborator

commented Mar 11, 2019

It looks great! On Safari, when you click on a pdf/bib link and then click the browser's back button, the little callout ("Open PDF" or "Export BibTeX") remains on.

@danielgildea

Collaborator

commented Mar 11, 2019

awesome!!!!!!!!!

@texttheater

Contributor

commented Mar 11, 2019

I think it would look better if the header had the same width as the content. I.e., the ACL logo would move to the left and the search box to the right, in order to align with the content.

@desilinguist

Member

commented Mar 11, 2019

Looks awesome! Great work! 👏

@mjpost

Member

commented Mar 11, 2019

What's the reason for inserting newlines in the bib field values? (for example, in booktitle here, and titles elsewhere).

@stevenbedrick

commented Mar 11, 2019

Disclaimer: This is about search, but is not about weird search behavior as such. Is Google Custom Search the long-term search solution for the new version of the Anthology? It is inherently waaaaay less functional than the existing search system on the current Anthology- for example, the current search page has really great result faceting, etc.

@stevenbedrick

commented Mar 11, 2019

And I just saw #165 - glad to see that something more flexible is on the roadmap/radar. In the meantime, we could also link to the DFKI "ACL Anthology Searchbench".

@aryamccarthy

Collaborator

commented Mar 12, 2019

On mobile, the magnifying glass of the search bar gets forced to the next row for me.

@aryamccarthy

Collaborator

commented Mar 12, 2019

Is the BibTeX generation handling special characters properly?

This entry has weird quotation marks in the abstract. http://aclweb.org/anthology/papers/C/C18/C18-1137.bib
This one has weird things going on in the title field. http://aclweb.org/anthology/papers/K/K18/K18-3001.bib

@danielhers

commented Mar 12, 2019

When there is just one paper in a conference, the noun after the number should be singular "paper" and not "papers".
Example: "Proceedings of the Pilot SENSEVAL" is listed with "1 papers" at http://www.aclweb.org/anthology/venues/semeval/
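
A minimal sketch of the singular/plural rule, written in Python purely for illustration. The site itself is built from templates, and `paper_count_label` is a hypothetical helper, not code from the repository:

```python
def paper_count_label(count: int) -> str:
    """Return a count label such as '1 paper' or '12 papers'.

    Hypothetical helper: only a count of exactly one takes the singular noun.
    """
    noun = "paper" if count == 1 else "papers"
    return f"{count} {noun}"


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(paper_count_label(n))
```

The same conditional could be expressed in whatever template language renders the venue pages.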

@rahular

commented Mar 12, 2019

Awesome work! One small issue I saw is that when I am browsing through papers on pages like this, there is no way for me to scroll back to the top instantly. The up button at the beginning of the page could instead float in a corner.

@danielgildea

Collaborator

commented Mar 12, 2019

Is the BibTeX generation handling special characters properly?

Fixed by [6bbc5a1]

@davidweichiang

Collaborator

commented Mar 12, 2019

Re: #170 (comment), when I view in Chrome or iOS Safari, I see mojibake, but on macOS Safari, it looks fine.

Although @danielgildea's fix puts the .bib file into ASCII (as it should be), I wonder whether, as a failsafe, the server could put Content-Type: application/x-bibtex; charset=utf-8 into the response header.
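
To illustrate why the charset matters, here is a small Python sketch. It is not the Anthology's server code, and the helper name `bib_content_type` is made up for illustration. It shows how the UTF-8 bytes of an en dash turn into mojibake when a client falls back to Windows-1252, and what the suggested header value looks like:

```python
def bib_content_type(charset: str = "utf-8") -> str:
    """Build the Content-Type value suggested above (hypothetical helper)."""
    return f"application/x-bibtex; charset={charset}"


# Sample string with an en dash (U+2013) between the two names.
title = "CoNLL\u2013SIGMORPHON"
raw = title.encode("utf-8")  # the bytes a server would send

# A client that guesses Windows-1252 misreads the three en-dash bytes.
as_cp1252 = raw.decode("cp1252")
# A client that is told (or guesses) UTF-8 recovers the original text.
as_utf8 = raw.decode("utf-8")

if __name__ == "__main__":
    print(bib_content_type())
    print(repr(as_cp1252))
    print(repr(as_utf8))
```

Declaring the charset explicitly removes the guesswork on the client side, which would matter for any file that is not pure ASCII.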

@danielgildea

Collaborator

commented Mar 12, 2019

What's the reason for inserting newlines in the bib field values? (for example, in booktitle here, and titles elsewhere).

anth2bib.py is just passing through newlines that are in the titles in the xml files.
I can't figure out where they come from originally. Personally, I think they make the bibtex more readable anyway.

anth2bib.py does insert newlines between author names. I think this makes it more readable,
especially when names are in "Last, First" format.
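
A rough sketch of the one-author-per-line layout, assuming names are already in "Last, First" form. This is not the code in anth2bib.py; `format_author_field`, its parameters, and the sample names are hypothetical, and the choice to pad with spaces rather than tabs is an assumption about the preferred style:

```python
def format_author_field(authors, key="author", indent=4):
    """Render a BibTeX author field with one name per line.

    Hypothetical helper: continuation lines are aligned under the first
    name using spaces only.
    """
    prefix = " " * indent + f'{key} = "'
    continuation = " " * len(prefix)
    body = (" and\n" + continuation).join(authors)
    return f'{prefix}{body}",'


if __name__ == "__main__":
    # Made-up names, for illustration only.
    print(format_author_field(["Doe, Jane", "Smith, John", "Roe, Richard"]))
```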

@aryamccarthy

Collaborator

commented Mar 12, 2019

Is the BibTeX generation handling special characters properly?

Fixed by [6bbc5a1]

I'm seeing "CoNLL–SIGMORPHON" in macOS Safari, instead of "CoNLL–SIGMORPHON". Does the build script need to be re-run to show the fix?

@mbollmann

Collaborator Author

commented Mar 12, 2019

Does the build script need to be re-run to show the fix?

Absolutely. Fixes are not reflected on the live website until @mjpost rebuilds it and pushes it there.

@mjpost

Member

commented Mar 12, 2019

I agree the one-line-per-author variant is more readable and is fine with me, as long as we make sure to use spaces and not tabs (per #16).

I'll rebuild soon, by tonight at the latest. Once we have continuous integration checks built (#102) and other checks against commits to the master branch, we can have it automated.

@mbollmann

Collaborator Author

commented Mar 12, 2019

Thanks for all the feedback so far! I've implemented a bunch of minor layout fixes based on the comments here (with the same caveat as above: will not be live until Matt rebuilds).

Disclaimer: This is about search, but is not about weird search behavior as such. Is Google Custom Search the long-term search solution for the new version of the Anthology? It is inherently waaaaay less functional than the existing search system on the current Anthology- for example, the current search page has really great result faceting, etc.

I believe Google Custom Search is much more powerful than people give it credit for, and it offers customization options that should allow for similar result faceting and features as before. However, that requires some more work on my part, and it wasn't really possible to implement and test this earlier as, by its very nature, it requires the new site to be live and getting indexed by Google first.

I'd really like to advocate for some more patience here over the coming weeks as I'm hoping to improve this. Maintaining a custom-made search solution is a huge liability IMO, and I would really like for people to give the Google version a fair chance first.

@stevenbedrick

commented Mar 12, 2019

@mbollmann That's totally fair, and thank you for the reply. I certainly see the value of using an off-the-shelf/hosted search platform in general, and also of using Google Custom Search in particular as a "getting things up and running" solution. For the sake of clarity, my concerns are less about the search behavior of GCS- if anybody can build a decent text search engine, it'd be Google! My concerns are more about search UI/UX- result faceting, etc. I'm happy to give GCS more of a chance, and am looking forward to seeing what we're able to do with GCS in terms of customization. Thank you (all of you!) for your efforts on this project; I do very much like the redesign overall and am excited to see it evolve!

@mjpost

Member

commented Mar 12, 2019

Okay, rebuilt. I also merged in master which had some corrections.

@aryamccarthy

Collaborator

commented Mar 13, 2019

Unclear whether this is a parsing error or a data error: this BibTeX has no article title.

@mjpost

Member

commented Mar 13, 2019

Thanks! The title appears in the HTML: (http://aclweb.org/anthology/D13-1088/) and is in the XML, so I'm not sure what's going on here.

@mbollmann

Collaborator Author

commented Mar 13, 2019

Thanks! The title appears in the HTML: (http://aclweb.org/anthology/D13-1088/) and is in the XML, so I'm not sure what's going on here.

Pretty sure it's related somehow to the title starting with <fixed-case>. It's fixed with the refactored BibTeX generation in 7cd20c3.
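
One way a leading child element can swallow a title, sketched in Python with `xml.etree.ElementTree`. This only illustrates the general pitfall and is not the code from 7cd20c3; whether this was the actual cause is an assumption, and the sample title and the `title_to_bibtex` helper are made up. In ElementTree, `.text` holds only the text before the first child, so it is `None` when the title starts with `<fixed-case>`:

```python
import xml.etree.ElementTree as ET


def title_to_bibtex(title_elem):
    """Flatten a <title> element into a BibTeX-style string.

    Hypothetical helper: text inside <fixed-case> is wrapped in braces so
    that BibTeX styles keep its capitalization; other children are inlined.
    """
    parts = [title_elem.text or ""]
    for child in title_elem:
        inner = "".join(child.itertext())
        if child.tag == "fixed-case":
            parts.append("{" + inner + "}")
        else:
            parts.append(inner)
        # Text that follows the child lives in its .tail, not in the parent.
        parts.append(child.tail or "")
    return "".join(parts)


# Made-up title that begins with a child element.
elem = ET.fromstring(
    "<title><fixed-case>BLEU</fixed-case> scores revisited</title>"
)
naive_title = elem.text  # None: there is no text before the first child
full_title = title_to_bibtex(elem)

if __name__ == "__main__":
    print(repr(naive_title))
    print(full_title)
```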

@mjpost

Member

commented Mar 13, 2019

Ah, I was looking at the master branch. I’ll rebuild tonight.

@mjpost

Member

commented Mar 13, 2019

Done, and the problem is indeed fixed. Thanks!

@seeledu

commented Apr 9, 2019

I'm a Chinese user, and I can't use the search results.

When I use a VPN, it works well.

@akoehn

Member

commented Apr 9, 2019

@seeledu, does google.com work without VPN? If not, that's the problem :-/

@knmnyn

Collaborator

commented Apr 9, 2019

Cross-ref #244. Google products are blocked by the Great Firewall of China occasionally (more often than not). This is why relying on Google products for international services that have a large membership in CN (i.e., ACL) is not usually a good idea.

As ACL is an international organization, we may want to think more about this issue (and maybe get the official word from the ACL Exec). Previously the Rails search allowed a pretty comprehensive search within the system, but now perhaps searching the static site would be easier in some ways.

@Franck-Dernoncourt

commented Apr 10, 2019

On the author page, e.g. https://www.aclweb.org/anthology/people/y/yan-song/, we can only see the count for the top 5 most frequent venues (it used to be possible to view the count for all venues):

[image]

@yucc2018

commented Apr 11, 2019

The papers (full PDFs) on the following page can't be downloaded:
https://www.aclweb.org/anthology/events/cl-2018/

@malikalamgirian

commented Apr 14, 2019

Where can I find the appendices? I could not find the appendices to the papers anywhere for EMNLP 2018.

@mbollmann

Collaborator Author

commented Apr 14, 2019

Where can I find the appendices? I could not find the appendices to the papers anywhere for EMNLP 2018.

Supplementary material can be accessed via the green buttons. For EMNLP 2018, they all seem to be labeled "Attachment". If that doesn't answer your question, can you clarify?

@malikalamgirian

commented Apr 14, 2019

I still could not find the attachments for this paper https://aclweb.org/anthology/papers/D/D18/D18-1514/

It mentions on page 4805 that Appendix A.2 contains some related materials, but I still could not find the Appendix or the attachments anywhere.

@mbollmann

Collaborator Author

commented Apr 14, 2019

I still could not find the attachments for this paper https://aclweb.org/anthology/papers/D/D18/D18-1514/

There don't seem to be any additional files for this submission on the server, so my first guess would be that despite what they write in the paper, the authors didn't actually provide an appendix. Maybe someone involved in ingesting EMNLP 2018 can dig into this more, but my assumption is that you'd have to contact the authors.

@mjpost

Member

commented Apr 15, 2019

Yes, it seems these authors forgot to submit it. If you contact them, you could let them know they could send it to me for uploading by creating an Issue here.

You can see what attachments look like on other papers, e.g., https://aclweb.org/anthology/papers/D/D18/D18-1512/

@yucc2018

commented Apr 16, 2019

Can anyone supply the NAACL 2019 papers?

@mjpost

Member

commented Apr 16, 2019

NAACL 2019 papers will be available in the Anthology on June 2, 2019.

@aryamccarthy

Collaborator

commented Apr 17, 2019

The main page of the anthology is slightly wider than all other pages. Moving to one of those other pages means the top bar contents jump closer together.

@mbollmann

Collaborator Author

commented Apr 17, 2019

The main page of the anthology is slightly wider than all other pages. Moving to one of those other pages means the top bar contents jump closer together.

Yes, that's because the overview tables are so wide and should use all the space to prevent scrolling. I don't think there's a good way to address this without a complete overhaul of how we present all the conferences on the main page.

@jwtaki

commented Apr 22, 2019

The search does not work?

@akoehn

Member

commented Apr 22, 2019

@jwtaki Are you in a country that blocks Google, by any chance? The search functionality is provided by Google, so search does not work in China, for example.

@allanj

commented Apr 23, 2019

New suggestion: it would be better to (have an option to) divide the papers in each year into different areas. Right now we have far more papers than 10 years ago, when we could still read all of them.

I try to read most of the papers, but I have to select some of them. Thus, the area of interest is probably the key factor for me and other readers in deciding what to read.

@akoehn

Member

commented Apr 23, 2019

@allanj The information is not stored in the metadata, so someone would have to classify all papers before any frontend logic could be written.

That is a lot of work, and it is often not clear at all what categories should be used and, once that is decided, which category a paper belongs to. Look at the high rates of papers moved between areas after the authors have selected a category.

@TobiasLee

commented Apr 25, 2019

It would be better to support searching for papers by institute name.

@aryamccarthy

Collaborator

commented Apr 25, 2019

@TobiasLee Like categories, this isn't stored in the metadata. We'd have to either extract it from the PDFs (which is noisy) or do it manually (which is so large as to be infeasible).

And on a personal note, I worry about the consequences of providing search by institution. Our community has become large enough that we can't read every paper, and search by institution may lead to a rich-get-richer bias in which papers get read. That's not a problem in itself; the problem is the other side of the scale: high-quality papers will be overlooked because of the institution they come from. (I believe that authors write papers, not institutions. The same could happen with author search, which is supported, but it's harder to make that systemic and entrenched.)

@TobiasLee

commented Apr 27, 2019

@aryamccarthy Thanks for your considerate reply.

@Evpok

commented Apr 29, 2019

@TobiasLee Like categories, this isn't stored in the metadata. We'd have to either extract it from the PDFs (which is noisy) or do it manually (which is so large as to be infeasible).

Consequences aside, GROBID is quite efficient at extracting such metadata from pdfs. HAL uses it to prefill metadata for new deposits.

@mikhovr

commented May 8, 2019

It seems that search results aren't displayed in my browser.
E.g. https://aclweb.org/anthology/search/?q=cross-lingual
Firefox Quantum 60.3.0esr (32-bit)
And it seems everything's okay in Chrome.
[image]

@mayhewsw

Contributor

commented May 14, 2019

This is a tiny tiny thing, but it always bothers me that the yellow banner on top ("You're viewing the latest version...") is too close to the header.

If you have the time and patience for such trivia, you can fix this by adding padding to the parent div (container).

<div class="container" style="padding-top:15px">
  <aside class="alert alert-warning text-center py-1 mt-n3 mt-md-n4 mt-xl-n5" role="alert">You're viewing the latest version of the ACL Anthology.
    <a class="btn btn-warning mx-2" href="https://github.com/acl-org/acl-anthology/issues/170">Give feedback</a>
  </aside>
</div>

@mjpost

Member

commented May 14, 2019

Fixed live and I agree it looks better. Want to submit a PR? hugo/_default/baseof.html I believe.

@mbollmann

Collaborator Author

commented May 14, 2019

Fixed live and I agree it looks better. Want to submit a PR? hugo/_default/baseof.html I believe.

Please use Bootstrap classes instead of custom style attributes, though. (Or play around with removing/changing the explicit negative margin classes of the banner, which is probably the more correct way of doing it.)

@iyuge2

commented May 18, 2019

awesome, thanks
