Fix browsers blocking GET requests on files/html due to CORS #94
Conversation
`updatejQueryContext` is updated to prepend a `localhost:3000` address to all URL requests. In addition, all extractor requested paths are updated. Previously, they started with `/test/pages/<...>`. Since this is only used for HTML pages under `/test/pages/`, I think we could remove that prefix: the server can use that directory as its root, reducing the length of the paths in the code.
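The prefixing described above could be sketched as a small helper. This is a minimal illustration, not the project's actual code; the function name and constant are assumptions:

```javascript
// Hypothetical sketch: prepend the local test server's origin to
// relative paths requested by extractors during tests.
const TEST_ORIGIN = 'http://localhost:3000';

function toTestUrl(path) {
  // Absolute URLs pass through untouched; relative paths get the origin.
  if (/^https?:\/\//.test(path)) return path;
  return TEST_ORIGIN + (path.startsWith('/') ? path : '/' + path);
}

console.log(toTestUrl('/bills.html')); // http://localhost:3000/bills.html
console.log(toTestUrl('exams.html'));  // http://localhost:3000/exams.html
```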
The previous attempt launched the web server programmatically and then used Node.js process APIs to open a browser on the `src/tests.html` file. However, I realized that the http-server CLI has an option to open a file, so a single command can launch the web server and open the default browser on the main tests page. This likely already works cross-platform, while my attempt would require special handling on Linux/Windows/Mac, I'm afraid... oh well, this sounds like a good approach.
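The single-command launch could look something like the following `package.json` script, using http-server's `-o` flag to open a path in the default browser after the server starts. The script name, port, and paths here are illustrative, not the actual project configuration:

```json
{
  "scripts": {
    "test:browser": "http-server ./test/pages -p 3000 -o /src/tests.html"
  }
}
```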
One of the tests was failing due to timeouts, so I slightly increased the timeout for that particular test to 5 seconds.
The previous heuristic was to get all `<tr>` elements of a table and consider it valid if their count was >= 2. The reasoning was that one `<tr>` is the header, so at least one extra `<tr>` is needed for the table body. Since there is a method that gathers the table header rows, the new approach is smarter because it does not assume a single header row. Computing the difference <all rows> - <header rows> gives the data/body rows, which are then used for validation.
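The improved validation reduces to a simple count comparison. A minimal sketch over plain row counts rather than DOM nodes; the function name is an assumption:

```javascript
// Hypothetical sketch of the validation: a table qualifies as a data
// table only if it has at least one row beyond its header rows.
function isValidDataTable(totalRowCount, headerRowCount) {
  const bodyRowCount = totalRowCount - headerRowCount;
  return bodyRowCount >= 1;
}

// A 3-row table with 2 header rows still has one data row, so it passes,
// while the old ">= 2 rows" check would also pass a 2-header/0-body table.
console.log(isValidDataTable(3, 2)); // true
console.log(isValidDataTable(2, 2)); // false
```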
I am converting this to a draft because I detected a potential conflict between DataTables and other extractors, for instance the exam extractor. I will look into it later today. I will probably keep pushing to this PR because I made some fixes to DataTables here, so... just preventing git conflicts.
So, some of the Sigarra code uses tables for layout. A very old technique, but still there. For instance, in the degree plan it is possible to see multiple nested tables, and only the last one in the tree shows the UCs, ECTS, etc.
By extracting some footers, for example rows showing a total value or similar, it would be possible to support more data tables. The heuristic is fairly simple: if the last row's columns do not match the table body's, wrap it in a `<tfoot>`.
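The footer heuristic above could be sketched on plain cell counts instead of DOM nodes. This is illustrative only; the function name and the exact mismatch rule are assumptions:

```javascript
// Hypothetical sketch: decide whether the last row of a table looks
// like a footer, i.e. its cell count differs from all the body rows'.
function lastRowLooksLikeFooter(rowCellCounts) {
  if (rowCellCounts.length < 2) return false;
  const last = rowCellCounts[rowCellCounts.length - 1];
  const body = rowCellCounts.slice(0, -1);
  return body.every((count) => count !== last);
}

// Body rows with 4 cells, final "total" row spanning into 2 cells -> footer.
console.log(lastRowLooksLikeFooter([4, 4, 4, 2])); // true
console.log(lastRowLooksLikeFooter([4, 4, 4, 4])); // false
```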
I noticed that sometimes DataTable starts modifying the DOM and then fails later in the process. This means the DOM has already suffered modifications, e.g. filter/export buttons, sorting buttons, etc. (which, funnily enough, can work to some extent in some columns, but not all...). In such cases, the proposed change tries to aggressively restore the DOM as if there had been no attempt to apply DataTable. This is achieved by cloning the table before any pre-processing, and then restoring the DOM if needed.
The issue went away. Indeed, applying DataTable was clashing with the bills extractor. But since DataTable fails, after performing the cleanup the bills extractor still works. Nonetheless, I think we should review all extractor contexts and block DataTables if (a) data tables don't make sense on a specific page (e.g. the exams page) or (b) there is any chance of conflicts (e.g. bills, which in fact we know in advance will fail anyway). Will do that tomorrow.
I noticed a few issues in some tables, so I reverted the heuristic. Instead, it now relies on a class used by Sigarra.
Before, the procedure was:

- Clone the original table DOM
- Try to apply DataTable
- If it fails, try to restore the original DOM cloned in the first step

The recovery is not that easy because DataTable can fail at very different stages. Perhaps it just added the wrapper containers for the buttons (csv, excel, ...) and the search input. But I found cases where it goes further and inserts sorting event listeners and buttons in some columns. There are also scenarios where it fails at a very early stage, adding just a few classes to the `class` attribute. Thus, it becomes difficult to guarantee a DOM as clean as the original one.

With this commit, the procedure is:

- Clone the original table
- Try to pre-process and apply DataTable on the cloned version
- If OK, replace the original table in the DOM with the new DOM containing the filter/search components, the DataTable itself, etc.

Advantages of the new approach:

- 100% clean DOM upon failure
- More performant: the DOM in the webpage is manipulated once per table on success. In case of failure, there is no change at all, saving rendering/parsing cycles.
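The "work on a clone, swap only on success" procedure generalizes to a small pattern. A sketch over an abstract interface, using plain objects instead of DOM nodes; every name here is illustrative, not the project's actual API:

```javascript
// Hypothetical sketch of the clone-then-swap pattern: applyFn may throw
// at any stage, but it only ever touches the clone, so the original
// stays pristine; replaceFn performs the single swap on success.
function upgradeSafely(original, { cloneFn, applyFn, replaceFn }) {
  const clone = cloneFn(original);
  try {
    applyFn(clone);           // all pre-processing happens on the clone
  } catch (err) {
    return original;          // original DOM untouched: clean failure
  }
  replaceFn(original, clone); // single DOM mutation, only on success
  return clone;
}

// Simulated success and failure with plain objects:
const ok = upgradeSafely({ id: 't1' }, {
  cloneFn: (n) => ({ ...n, upgraded: true }),
  applyFn: () => {},
  replaceFn: () => {},
});
console.log(ok.upgraded); // true

const failed = upgradeSafely({ id: 't2' }, {
  cloneFn: (n) => ({ ...n }),
  applyFn: () => { throw new Error('DataTable init failed'); },
  replaceFn: () => {},
});
console.log(failed.upgraded); // undefined (original returned untouched)
```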
- The main issue was Infinite Scroll querying DOM elements in its constructor before the DataTable extractor completed its job. As a result, the DOM would be modified and the references held by Infinite Scroll would become invalid.
- Another issue was the loading GIF going missing after the first load. The reason is that when the DataTable is updated with new data, the DOM is reconstructed and the `<img>` node for the GIF is destroyed. The fix is to re-attach this element to the DOM every time.
- This also changes how data is added to an existing table. There is no need to destroy the previous table: the API allows passing the new elements and then redrawing the table. Besides, you can pass HTML elements as input (i.e. `<tr>` nodes).
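The append-without-destroy flow mirrors DataTables' chained `rows.add(...).draw()` style. Below is a tiny mock of that API surface to illustrate the idea; the mock is not DataTables itself, and its internals are pure invention for demonstration:

```javascript
// Mock of a DataTables-like surface to illustrate appending rows and
// redrawing without destroying the table. Illustrative only.
function makeTable(initialRows = []) {
  const rows = [...initialRows];
  const table = {
    rows: {
      // Accepts new row data (in DataTables, this may also be <tr>
      // elements) and returns the table so draw() can be chained.
      add(newRows) { rows.push(...newRows); return table; },
    },
    draw() { return rows.slice(); }, // pretend redraw; report current rows
  };
  return table;
}

const table = makeTable(['<tr>old</tr>']);
const drawn = table.rows.add(['<tr>new1</tr>', '<tr>new2</tr>']).draw();
console.log(drawn.length); // 3 — old row kept, table never destroyed
```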
On some pages we know in advance that data tables or infinite scroll simply won't work and could conflict with other extractors. The manifest is updated to exclude those extractors on specific pages: bills, exams and timetable.
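The exclusion could be expressed along these lines. The keys and structure below are entirely hypothetical, a guess at what such a manifest entry might look like rather than the project's actual schema:

```json
{
  "extractors": {
    "dataTable": { "excludePages": ["bills", "exams", "timetable"] },
    "infiniteScroll": { "excludePages": ["bills", "exams", "timetable"] }
  }
}
```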
The `$` selector was creating a new DOM object every time it was invoked. Imagine an extractor that uses `$` to select some DOM objects and modify the DOM. A subsequent use of `$` would create a new DOM from the original HTML file, and the previous changes would be lost.
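The fix amounts to memoizing the DOM context so every `$` call shares the same object. A sketch of that idea with plain objects standing in for the parsed DOM; all names are assumptions:

```javascript
// Hypothetical sketch: build the DOM context once and reuse it, so
// modifications made through '$' persist across calls.
function makeContextFactory(buildContext) {
  let cached = null;
  return function getContext() {
    if (cached === null) cached = buildContext();
    return cached;
  };
}

// Simulated: each buildContext() call would re-parse the HTML file.
let builds = 0;
const getContext = makeContextFactory(() => ({ dom: ++builds }));
getContext().modified = true;        // mutate via the shared context
console.log(getContext().modified);  // true (changes are not lost)
console.log(builds);                 // 1    (DOM parsed only once)
```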
Cloning the `<table>` node and doing all the pre-processing and DataTable initialization on the cloned version. Only when there is a guarantee that the table is valid is the DOM updated.