errors in modular form data #1223
Comments
@jwbober : Can you do a first attempt to separate these out by the actual issue? (Whatever you're doing to check, you can sort them by the problem, right?) Starting from the top... 10.11.7.b, 10.11.7.c: this means that the form is level 10 and weight 11, right? I agree this form is not in the database, but it's not advertised as being there. 10.12.9.a: This one appears Let's break this up into cases. |
Please note that Fredrik mentioned before that probably all of these that On Sat, 7 May 2016 at 14:06 jvoight notifications@github.com wrote:
|
When I do the same comparison but only use a1 through a8 I get fewer errors, but I still get a list of 600. I will go through this list and try to produce examples of various problems that come up. I got my list of forms by manually querying the database by modifying some code Fredrik sent me. I don't know why in some cases I found a form in the database but the space of forms is not available from the website. |
|
From http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/93/2/ I can click to reach the space http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/93/2/16/, but that page produces a server error. |
http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/58/2/57/ just has completely wrong coefficients. It should start q + a0*q^2 + a0*q^3 - q^4 + q^5 - q^6 - 2*q^7 - a0*q^8 + 2*q^9 + O(q^10) (where a0^2 = -1, so at least the number field is correct.) |
93/2/16 looks fine after #1222 which I am checking now. |
I find it very hard to believe that a_9 is not computed correctly. Don't we use standard Sage for the one-off computations of data to be inserted? |
@jwbober Sorry to be devil's advocate but why should we believe that your own data is correct? Is it just the computation of a_n for composite n which we have wrong in the database? |
It is not all composite n. For 93/2/16 there is a problem with a4, but a9 is correct. I will see what happens when I try to run comparisons using just prime coefficients. For http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/58/2/57/, I copy-and-pasted Sage output, which agrees with what I've computed. I'm pretty sure that my data is correct, because Dave Platt has verified the Riemann Hypothesis at small height for all of it. The comparisons I'm making do have some problems, though. For example, I notice now that some spaces don't have all of the Galois orbits in the LMFDB, which will mess up the comparisons I've made. |
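One cheap way to separate bad composite coefficients from bad prime coefficients is to test the Hecke relations on the stored q-expansion itself. A minimal sketch in plain Python, using the expansion quoted above for 58/2/57 with a0 taken to be a square root of -1 (an assumption; it is the choice for which a_6 = a_2 a_3 holds). The helper names are hypothetical, and only relations that need no character values are tested:

```python
from math import gcd

def is_prime(p):
    return p > 1 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def failed_relations(a, level, tol=1e-9):
    """Return the sorted indices m*n at which a Hecke relation fails.

    a maps n -> a(n) for a normalized newform, with keys 1..max(a).
    Two relations are tested, neither of which needs the character:
      * a(m*n) == a(m)*a(n) when gcd(m, n) == 1
      * a(p*n) == a(p)*a(n) when p is a prime dividing the level
    """
    top = max(a)
    bad = set()
    for m in range(2, top + 1):
        for n in range(2, top // m + 1):
            coprime = gcd(m, n) == 1
            bad_prime = is_prime(m) and level % m == 0
            if (coprime or bad_prime) and abs(a[m * n] - a[m] * a[n]) > tol:
                bad.add(m * n)
    return sorted(bad)

a0 = 1j  # assumption: a0 is a square root of -1
expansion = {1: 1, 2: a0, 3: a0, 4: -1, 5: 1, 6: -1, 7: -2, 8: -a0, 9: 2}
print(failed_relations(expansion, level=58))
```

A check like this cannot certify a form, since a consistently wrong set of prime coefficients passes it, but it flags records where only the composite coefficients were corrupted.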
Good answer! |
@fredstro: This one is very strange. The eigenvalues record is very old - On Sat, 7 May 2016 at 14:52 Jonathan Bober notifications@github.com wrote:
|
@fredstro We have to trace back all of these data issues to their source On Sat, 7 May 2016 at 15:26 Stephan Ehlen stephan.j.ehlen@gmail.com wrote:
|
Is it possible that when we made the big switchover to the new modular form database format, some of the old-style forms were left in and hence not recomputed and are now causing trouble? I may be speaking nonsense. It is not hard to write a script to go through a collection and delete all items created before (or after) a certain date: the mongo _id field knows a lot about where it came from. |
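The reason the _id field helps is that a default MongoDB ObjectId starts with a 4-byte big-endian Unix timestamp recording when the id was generated. A minimal sketch of reading that timestamp, using only the standard library and hex strings rather than a database driver; the function names and the cutoff date are hypothetical:

```python
from datetime import datetime, timezone

def created_at(object_id_hex):
    """Creation time encoded in a 24-character hex ObjectId string."""
    seconds = int(object_id_hex[:8], 16)  # first 4 bytes: Unix time in seconds
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

def boundary_id(cutoff):
    """Smallest ObjectId hex string whose timestamp equals `cutoff`."""
    return format(int(cutoff.timestamp()), "08x") + "0" * 16

def created_before(object_id_hex, cutoff):
    return created_at(object_id_hex) < cutoff

cutoff = datetime(2016, 1, 1, tzinfo=timezone.utc)  # hypothetical switchover date
example_id = "507f1f77bcf86cd799439011"             # arbitrary example id
print(created_at(example_id), created_before(example_id, cutoff))
```

With a real driver, the boundary id would be wrapped in the driver's ObjectId type and used in a range query on `_id`. Listing the matching records first, and deleting only after reviewing that list, seems the safer order.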
I don't think this is possible because the "old" database is so different On Sat, 7 May 2016 at 15:30 John Cremona notifications@github.com wrote:
|
Yes -- missing data is better than wrong data. |
http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/15/5/11/ is another example that is completely wrong, as I can confirm with Sage. (it does not even have the correct dimension). I am going to stop looking at these one example at a time, and instead take the time to rewrite my comparison script to be a bit better. (And instead of trying to find the form in the LMFDB that looks like one of my forms, I will do it the other way around.) |
If I go to http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/3/73/2/ I see the q-expansion for 3.73.2.b up to O(q^10), but if I then click on http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/3/73/2/b I just see q+\alpha*q^2. Is that as intended? |
In this case you have to scroll down ;-) That’s more of a user interface issue but, honestly, for forms like this I’m not sure if displaying the q-expansion is of any use to anyone
|
+1 We need an automated way to verify the q-expansions we are displaying On 2016-05-07 15:46, Jonathan Bober wrote:
|
@sehlen my bad :). I agree this is a user interface issue that is not critical. There are others, btw, for example the bread you see on http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/3/73/2/b does not include an entry for http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/3/73/2. |
|
@sehlen Actually what I should have said is that I think there should be a terminal 'b', i.e. the bread should show the complete path to this page (and also it would be better to either have the word character in both pages or neither, I would vote for neither). |
Good point, @AndrewVSutherland and easy to fix.
|
@jwbober You have a few forms in here with trivial character. I checked the first few by hand and couldn't see a problem but then I'm not a machine. Could you maybe double check all those with trivial character and if there are in fact no issues (or only a few that we can fix easily), we should simply hide all non-trivial characters for Tuesday and I can spend the next week at AIM working on them and on data quality etc. instead of caring about new things to bring into the lmfdb, which I was supposed to do. Maybe some people would like to join me... |
+1 I think this is a very good idea. |
It's definitely "less bad" to hide wrong data than to display it, so if we are confident with Gamma_0, then I vote that we hide just Gamma_1 for Tuesday until the data can be verified. It's extremely important to get the MF to stand up straight in LMFDB. Even for new things: I can't compute base change HMFs, we can't match genus 2 curves of GL_2-type, etc., until classical MFs are stable and correct. I'd strongly encourage some subset of people to focus 100% on this next week. To "certify" the data, in addition to checking it using other sources, it would be nice to see the functional equation of the L-function certified in a couple of different ways--that's not a proof, but it's close to a gold standard. |
@sehlen It looks like http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/79/12/1/a/ occurs on my list because the embedding is not precise enough. I haven't verified that the coefficients are correct, but at least some digits of these ridiculously huge numbers are correct for a3. However, the 32nd embedding (at least) should be -211.15284..., but instead is -242.8339... (which is exactly what I get with Sage when I try to compute the embedding using only 53 bits of precision). This is just the first trivial character that I looked at by hand. I'm running a better comparison with what I have now. (But there are some cases where I don't have data, e.g. I have nothing of my own for weight > 12.) |
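The 53-bit failure is what one expects when a coefficient is stored as an exact expression with huge integer coefficients whose terms nearly cancel at a given embedding. A toy illustration in plain Python (not Sage, and not the LMFDB code): the integers c0 and c1 below are invented so that c0 + c1*sqrt(2) lies between 0 and 1, and the value is computed once with doubles and once with 60 significant digits.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 60  # 60 significant decimal digits, roughly 200 bits

# Invented "coefficient" c0 + c1*alpha at the embedding alpha -> sqrt(2).
# c0 is chosen so that the two huge terms almost cancel.
c1 = 10 ** 28
c0 = -int(Decimal(2).sqrt() * c1)

low = float(c0) + float(c1) * math.sqrt(2)            # 53-bit doubles
high = Decimal(c0) + Decimal(c1) * Decimal(2).sqrt()  # 60 significant digits

print("53 bits:  ", low)
print("60 digits:", high)
```

Each term has about 29 decimal digits while a double carries about 16, so the double-precision sum has no correct digits at all. The working precision needs to exceed the size of the integer coefficients by a comfortable margin.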
The problems Bober found indeed have different causes, which I found on
@sehlen: I find it strange that the computing routines don't work for Fredrik On Sat, 7 May 2016 21:44 jvoight, notifications@github.com wrote:
|
http://www.lmfdb.org/ModularForm/GL2/Q/holomorphic/29/4/1/a/ looks like the coefficients might be correct, but the embeddings are completely off. |
Of course, I forgot to mention that there are also errors that are there On Sat, 7 May 2016 22:12 Jonathan Bober, notifications@github.com wrote:
|
Trivial character is not so bad. There seem to be a small enough number of errors that I think I know what is wrong in each case. (I've done the comparison on weights 2 through 12 and level < 100, and weights 2 through 4 and level < 400. Above level 100 things get better for trivial character at least.)
For the forms 29.4.1.a the embeddings are completely wrong. (Probably they start with a(101), as Fredrik suggests.) Perhaps the algebraic coefficients are correct. Anyway, those are probably small enough that the whole space can be recomputed quickly.
For the forms 71.12.1.b I think I run into issues because the embeddings are not accurate enough. So probably the forms are basically correct, but the embeddings can be way off when only computed with 53 bits of working precision.
My comparisons are not completely reliable right now, but are pretty good. Basically, for level < 100 and weight 2 through 12, and level < 400 or so and weight 2 through 4, and every character, I've (numerically) computed embeddings of every modular form, but I don't have algebraic information (e.g. Galois orbits). (In some cases, when I have enough precision, I can identify things as exact algebraic numbers, and get the algebraic information, but it is going to be impossible to always do this, and I haven't yet tried to do it systematically.)
Anyway, I've used the embeddings that I grabbed from the database to try to match embeddings that I've computed. For a form in a given space, I use the first 30 coefficients (or just a(p) for p < 30) to select the closest form out of the forms I've computed in that space, using some sort of L2 distance (first normalizing coefficients so that they have absolute value < 1, to deal with some precision issues). The cases above are the ones where this distance is > .5 for at least one of the embeddings.
I haven't checked yet that everything that should be in the database is there, but I have at least checked that for the forms that do match one of my forms, none of my forms is matched twice. My checks will not see any problems if there happen to be cases where the algebraic coefficients are wrong but the embeddings are correct. |
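The matching procedure described above can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the script that was actually run: the normalization divides a(n) by the Deligne bound d(n) n^((k-1)/2) (which gives absolute value at most 1), all function names are hypothetical, and the coefficient lists are invented toy data.

```python
import math

THRESHOLD = 0.5  # distance above which a match is reported as suspicious

def num_divisors(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def normalize(coeffs, weight):
    """Scale a(n) by d(n) * n^((k-1)/2); coeffs[i] holds a(i+1)."""
    return [a / (num_divisors(n) * n ** ((weight - 1) / 2))
            for n, a in enumerate(coeffs, start=1)]

def l2_distance(xs, ys):
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(xs, ys)))

def closest_form(target, candidates, weight):
    """Return (label, distance) of the candidate nearest to target."""
    t = normalize(target, weight)
    dist = {label: l2_distance(t, normalize(c, weight))
            for label, c in candidates.items()}
    best = min(dist, key=dist.get)
    return best, dist[best]

# Invented data: two independently computed forms in one weight-2 space,
# plus one good and one corrupted database record.
computed = {"A": [1, -2, -1, 2, 1], "B": [1, 0, -2, -2, 1]}
good_record = [1, -2, -1, 2, 1.0000001]
bad_record = [1, 2, 3, -2, -1]

for record in (good_record, bad_record):
    label, d = closest_form(record, computed, weight=2)
    print(label, round(d, 3), "SUSPICIOUS" if d > THRESHOLD else "ok")
```

Normalizing first keeps large-index coefficients in high weight from dominating the distance, which is presumably the precision issue the normalization is meant to address.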
Some possibly useful stats on the 2450 spaces currently in modularforms2.webmodformspace that have trivial character. Of these, there is a complete set of (not necessarily correct) newform records in modularforms2.webnewforms matching the list of hecke_orbit_labels for all but 227 spaces. Among these 227 spaces, there are no hecke_orbit records at all for 222, in which case the user will not be able to navigate to them (other than by typing in the URL directly). The remaining 25 have some but not all of the newforms they should, which means the user can navigate to a page that will show some missing. I might suggest that for these 25 we either fill in the missing ones or delete the ones that are there. Here is a list of the 25: |
I started to work on this list but I am a bit confused because the majority seem in fact to be complete. Most importantly, I managed to fix the remaining ones, feel free to check: |
@jwbober The list you gave for trivial character should be fixed now, please rerun your checks. |
I just checked what Sage gives when called directly and indeed increasing the precision for the higher level/weight cases gives substantially different results. I guess this has not been taken into account in the computation of the embeddings as of now. This should not be too hard to fix. |
I think our results differ because I was not filtering by version, but looking more closely I see that I should presumably only be looking at records in webmodformspace that have 'version' set to 1.3, since all the records in webnewforms have 'version' set to 1.3. I will update my script to account for this and rerun (which will then also reflect your recent updates). Part of the problem here is that there is no information listed in the github inventory for modular forms (or if there is, I can't easily find it). See LMFDB/lmfdb-inventory#13 (which I We really do need to have documentation that explains all the data in the mongo db that is actually used in the production system (the modularforms2 database is particularly confusing because there are so many different collections and different versions of records in each collection). On 2016-05-08 00:09, Stephan Ehlen wrote:
|
OK, I reran the script and can confirm that there are now no partially complete spaces with trivial character. There are a bunch that have no Hecke orbit data, which is fine since no link to the space is displayed in this case. The script is lmfdb/modular_forms/elliptic_modular_forms/emf_db_stats.py, which I added in pull request #1235 (in case you want to make any corrections or additions). It also checks that the Hecke orbits are labeled consistently, and when present, that the dimensions sum correctly (this was true in every case where there were Hecke orbits present). Below is the output of the script, which lists the spaces with missing data. I note that in every case the list of Hecke orbit labels has length equal to the dimension which I assume is just a temporary place holder (surely they do not all have dimension 1?!). This isn't visible to the user, so I don't see it as an immediate problem. No Hecke orbit data for space 7.32.1 of dimension 15 with 15 Hecke orbits |
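The kind of check described here can be sketched on in-memory records. This is not the contents of emf_db_stats.py: apart from hecke_orbit_labels, the field names, the function name, and the sample records are all hypothetical.

```python
def classify_space(space, newform_dims):
    """Classify a space as 'missing', 'partial', 'dimension mismatch' or 'complete'.

    space:        dict with 'dimension' and 'hecke_orbit_labels'
    newform_dims: dict mapping Hecke orbit label -> dimension of that orbit,
                  for the newform records that actually exist
    """
    expected = space["hecke_orbit_labels"]
    present = [label for label in expected if label in newform_dims]
    if not present:
        return "missing"    # no link is shown, so this is harmless
    if len(present) < len(expected):
        return "partial"    # user reaches a page with orbits missing
    if sum(newform_dims[label] for label in present) != space["dimension"]:
        return "dimension mismatch"
    return "complete"

# Invented sample data.
newform_dims = {"S1.a": 1, "S1.b": 2, "S2.a": 1}
spaces = {
    "S1": {"dimension": 3, "hecke_orbit_labels": ["S1.a", "S1.b"]},
    "S2": {"dimension": 4, "hecke_orbit_labels": ["S2.a", "S2.b"]},
    "S3": {"dimension": 2, "hecke_orbit_labels": ["S3.a", "S3.b"]},
}
for name, space in spaces.items():
    print(name, classify_space(space, newform_dims))
```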
I agree that the database modularforms2 is very confusing with so many I think it would be good to create a new database where only the On Sun, 8 May 2016 12:53 AndrewVSutherland, notifications@github.com
|
Separating the databases and also cleaning up the database for the website On Sun, May 8, 2016, 09:13 Fredrik Strömberg notifications@github.com
|
One thing I might suggest (not right now, but down the road) is that you distinguish versions at the collection or even database level, not at the record level. Having to filter by version (and including the version attribute in all your keys) slows things down and, more seriously, it invites mistakes. Suppose someone accesses the collection using the find_one() method (specifying a label they think should be a unique key) and does not specify the version attribute -- their code might well see a mixture of different versions that are just similar enough for their code to work. This could produce very subtle bugs.
More generally, we should all bear in mind that the data we store in the LMFDB mongo db is public. Anyone (not just LMFDB developers) can access the data on a read-only basis through the LMFDB API (see http://lmfdb.warwick.ac.uk/api/). In the future there may very well be applications or users (e.g. from Sage math cloud) pulling data from the LMFDB mongo database that do not necessarily use any of our code to access it (they might not even be coding in python). As individual developers we may find it convenient to implement a python class that manages access to data associated to a particular object and have our code view the associated mongo db data as "private" to that class, but there is nothing that forces others to do the same (not even within the LMFDB development community), nor do we want there to be. Any data augmentation or access control that is implemented in such a class as a layer above the data may be completely invisible to others, and we should all keep this in mind (think of your class as a "glass box" not a black box). |
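The find_one() hazard can be shown without a database. In the sketch below, `find_one` is a hypothetical stand-in that mimics a driver returning a single matching document from a list of dicts, and the records (label, versions, dimensions) are invented.

```python
def find_one(collection, query):
    """Return the first record matching every key in query, else None."""
    for record in collection:
        if all(record.get(key) == value for key, value in query.items()):
            return record
    return None

# Invented records: one label stored under two versions with different data.
webnewforms = [
    {"label": "11.2.1.a", "version": 1.2, "dimension": 2},  # stale record
    {"label": "11.2.1.a", "version": 1.3, "dimension": 1},  # current record
]

unfiltered = find_one(webnewforms, {"label": "11.2.1.a"})
filtered = find_one(webnewforms, {"label": "11.2.1.a", "version": 1.3})

print(unfiltered["version"], unfiltered["dimension"])
print(filtered["version"], filtered["dimension"])
```

The unfiltered lookup silently returns whichever version happens to come first and raises no error, which is why keeping versions in separate collections removes a whole class of mistakes.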
Issue merged with #1248 |
I have been trying to match modular form data in the LMFDB with data that I computed myself, so a few days ago I grabbed all of the complex embeddings of the q-expansions of all of the modular forms in the LMFDB. Below is a list of 1080 forms with weight between 2 and 12, and level < 100, that I have had trouble matching with my data.
This is not necessarily a comprehensive list of errors, and it is even possible that some of the forms in this list are fine, and I hope I am wrong about how much is wrong (possibly many of the cases with level < 50 have been fixed by now) but every form that I have manually examined has turned out to have problems. In some of them a9 is wrong for some reason. In at least two cases there are rational forms with the wrong coefficients. I think in another case there is a form that looks rational but shouldn't be. In other cases the page or the space gives an error, but maybe it can be clicked on. For others the page or space says it is not in the database.
Here is the list: boberlist.txt