Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve asteroidal triples code #19337

Closed
dcoudert opened this issue Oct 3, 2015 · 30 comments
Closed

Improve asteroidal triples code #19337

dcoudert opened this issue Oct 3, 2015 · 30 comments

Comments

@dcoudert
Copy link
Contributor

dcoudert commented Oct 3, 2015

Fix some types problems raised in #19334 for asteroidal_triples.pyx.
In particular, we get rid of uint32_t whenever possible.

Depends on #19334

CC: @nathanncohen

Component: graph theory

Author: David Coudert

Branch/Commit: u/dcoudert/asteroid @ e32015d

Issue created by migration from https://trac.sagemath.org/ticket/19337

@dcoudert dcoudert added this to the sage-6.9 milestone Oct 3, 2015
@dcoudert
Copy link
Contributor Author

dcoudert commented Oct 3, 2015

Branch: u/dcoudert/asteroid

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 3, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

aa2f9b3trac #19337: better comments

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 3, 2015

Commit: aa2f9b3

@dcoudert

This comment has been minimized.

@dcoudert
Copy link
Contributor Author

dcoudert commented Oct 3, 2015

comment:3

Using int should be enough for the size of graphs we can process.

@jdemeyer
Copy link

jdemeyer commented Oct 3, 2015

Changed dependencies from #19334 to none

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 3, 2015

comment:5

Err... int are usually larget than 32bits ?...

@jdemeyer
Copy link

jdemeyer commented Oct 3, 2015

comment:6

Replying to @nathanncohen:

Err... int are usually larget than 32bits ?...

In fact, no. On all systems I know, ints are exactly 32 bits.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 3, 2015

comment:7

In fact, no. On all systems I know, ints are exactly 32 bits.

Funny. I had somewhere in my head that int and long were the same size.

Anyway, why should we downgrade those uint32 to int exactly? O_o

Nathann

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 3, 2015

comment:8

It seems from the comments on the other ticket that you would only need to replace "-1" with "-1L"?..

@jdemeyer
Copy link

jdemeyer commented Oct 3, 2015

comment:9

Replying to @nathanncohen:

In fact, no. On all systems I know, ints are exactly 32 bits.

Funny. I had somewhere in my head that int and long were the same size.

on 32-bit systems that's true :-)

Note that the C standard doesn't say much about this. On Windows for example, long is always 32-bit (even on 64-bit systems). There might be systems where int is 64-bit.

@vbraun
Copy link
Member

vbraun commented Oct 3, 2015

Dependencies: #19334

@vbraun
Copy link
Member

vbraun commented Oct 3, 2015

comment:11

The only guarantee by the C spec is that int is signed and at least 16 bit.

I'd recommend the C99 int_fast32_t type which will be guaranteed to be at least 32 bit and the fastest on the target system. IMHO you should never use the old int / long except to match legacy libraries, its bad style and very hard to write 100% standards-compliant code.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 3, 2015

Changed commit from aa2f9b3 to e32015d

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 3, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

92b960btrac #19337: Merged with 6.9.rc2
e32015dtrac #19337: try with int_fast32_t

@dcoudert
Copy link
Contributor Author

dcoudert commented Oct 3, 2015

comment:13

So let's try with int_fast32_t if it is better.

@jdemeyer
Copy link

jdemeyer commented Oct 4, 2015

comment:14

Replying to @vbraun:

I'd recommend the C99 int_fast32_t type

-1.

Why should this graph code use a type of at least 32 bits? Where does the arbitrary number 32 come from? Why not int_fast16_t or int_fast64_t then?

If you don't want int, I'd go with long then which is also guaranteed to be at least 32 bits and which matches the actual return type of bitset_first_in_complement().

you should never use the old int / long except to match legacy libraries, its bad style and very hard to write 100% standards-compliant code.

Really, why that? I would say that you should always use types like long unless you need a specific bit-length.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 4, 2015

comment:15

Why should this graph code use a type of at least 32 bits? Where does the arbitrary number 32 come from? Why not int_fast16_t or int_fast64_t then?

I did not check this specific code but uint32_t probably comes from the data structure "static_sparse_graph", which uses them.

Then, as David said, this algorithm does not need uint32_t -- it would not run. The 31 bits of int32_t would do, and 16 bits would probably be sufficient.

Really, why that? I would say that you should always use types like long unless you need a specific bit-length.

Here we want to specify the bit length, because the smallest the type is the better it will be cached. We will read memory very very often in that code.

Nathann

@jdemeyer
Copy link

jdemeyer commented Oct 4, 2015

comment:16

About the "fastest on the target system": that very much depends on the application. On 64-bit systems, it could easily be a 64-bit type. If you're making arrays of those (it seems that you do), you can end up slower than int32_t because of the larger memory accesses.

PS: I realize that we're bikeshedding here. I'm not really against int_fast32_t, I just wouldn't recommend it in this case. I would advice: if you're certain(!) that 32 bits is enough, use int32_t. Otherwise, use long.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 4, 2015

comment:18

About the "fastest on the target system": that very much depends on the application. On 64-bit systems, it could easily be a 64-bit type. If you're making arrays of those (it seems that you do), you can end up slower than int32_t because of the larger memory accesses.

Yes yes, I agree.

PS: I realize that we're bikeshedding here. I'm not really against int_fast32_t, I just wouldn't recommend it in this case. I would advice: if you're certain(!) that 32 bits is enough, use int32_t. Otherwise, use long.

Sooo... Would you say that there is actually anything that needs be changed in this file? We use fixed-size types, they are unsigned because that's what the data structure expects, and the change in Cython has been fixed by Volker with a cast.

Nathann

@dcoudert
Copy link
Contributor Author

dcoudert commented Oct 4, 2015

comment:19

Hello,

the algorithm has time complexity O(n^3) and space complexity O(n^2). So we will certainly not use it on graphs with 100,000 nodes (I can't on my laptop).

I was using uint32_t since static_sparse_graph uses it, but I agree that long is more appropriate with bitset_first_in_complement(). So I can use long for all indexes, or a more appropriate type if any.
For the arrays, all values stored in that array are in range 1..n. So either I use a 16 bits type and add a test at the beginning of the method in case n needs more than that, or I use any type with at least 32 bits. Well, uint32_t is good for that.

Let me know.

David.

@vbraun
Copy link
Member

vbraun commented Oct 4, 2015

comment:20

If you need only 16 bits then there is of course int_fast16_t. Its definition is analogous to plain int except that the latter has no promise of being a particularly fast choice. I don't care about the actual bitwidth, it should just be reflected (and easily readable) in the code.

Replying to @jdemeyer:

If you don't want int, I'd go with long then which is also guaranteed to be at least 32 bits

But not guaranteed to be particularly fast.

and which matches the actual return type of bitset_first_in_complement().

I.e. the "legacy libraries" case.

you should never use the old int / long except to match legacy libraries, its bad style and very hard to write 100% standards-compliant code.

Really, why that? I would say that you should always use types like long unless you need a specific bit-length.

But by using long you are already asking for a particular bit width, except that it does not document the intent (how many Sage contributors know the specs for long?). Usually, you see long because

  • the author mistakenly believes that it is the longest int type
  • the author mistakenly believes that it can hold a pointer
  • the author mistakenly believes that it is at least 64-bit
  • the author mistakenly believes that it is 64-bit on a 64-bit machine
  • cargo cult, i.e., to match legacy code

With the C99 data types you just need to answer

  • what is the minimum bit-width required
  • do I need to optimize for space or speed
    and the resulting program inherently documents it. Its so much better, a total slam-dunk.

@jdemeyer
Copy link

jdemeyer commented Oct 4, 2015

comment:21

Replying to @vbraun:

Usually, you see long because

  • the author mistakenly believes that it is the longest int type
  • the author mistakenly believes that it can hold a pointer
  • the author mistakenly believes that it is at least 64-bit
  • the author mistakenly believes that it is 64-bit on a 64-bit machine
  • cargo cult, i.e., to match legacy code

I can play that game too:

Usually, you see int_fastN_t because

  • the author mistakenly believes that his code will never need more than N bits (think "640K should be enough for everybody")
  • the author mistakenly believes that int_fastN_t will yield the fastest possible code (not true if the code is memory-bound)

what is the minimum bit-width required

I don't think that one should always answer this question. There are valid use cases for types which have different sizes on 32-bit and 64-bit systems.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 4, 2015

comment:22

I can play that game too:

Go play it somewhere else. This thing is a trac ticket. Review it or exchange private emails.

Nathann

@vbraun
Copy link
Member

vbraun commented Oct 4, 2015

comment:23

Replying to @jdemeyer:

what is the minimum bit-width required

I don't think that one should always answer this question. There are valid use cases for types which have different sizes on 32-bit and 64-bit systems.

I agree that there are valid use cases, but

  • int/long are also terrible for that; They encode mostly encode historical accidents, e.g. long is different on 64-bit linux vs 64-bit windows on the same hardware
  • in Mathematics, correctness and reproducibility (including on different OS/hardware) are much more important than elsewhere. If you can't trust that the result is correct then neither speed nor memory usage matters.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 5, 2015

comment:24
  • in Mathematics, correctness and reproducibility (including on different OS/hardware) are much more important than elsewhere. If you can't trust that the result is correct then neither speed nor memory usage matters.

This, right after you decided to not include a stopgap in a code that returns wrong answers? That made me laugh.

@dcoudert
Copy link
Contributor Author

dcoudert commented Oct 9, 2015

comment:25

Today, with sage 6.9.rc3 and without this patch, I have a segfault!

With this patch it's working.

I don't know which is the best type to use and above discussion confuses me.

What I do know is that this patch solves a segfault and so is needed.

Best,
David.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 9, 2015

comment:26

Do you know if the segfault is deterministic? Does it happen consistently or may it be related to something else? I do not see how changing the type may fix anything that is not already fixed by #19334.

Nathann

@dcoudert
Copy link
Contributor Author

comment:27

sorry, I thought that #19334 was already merged.
So, with #19334 I don't have segfault. I hope it will be merged shortly.

@dcoudert
Copy link
Contributor Author

comment:28

We can certainly close this ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants