[Do not Merge][Discussion] Action Plan for improving solvers. #2948

Closed
wants to merge 3 commits into
base: master
from

8 participants
@hargup
Member

hargup commented Feb 21, 2014

This PR is here to discuss the design aspects of improving the current solvers. I'm doing it here because I think GitHub PRs are better suited to this purpose than mailing lists: people can comment on specific lines, we can write comments in Markdown, we can ping specific people, and we can revise the text.

TODO:

  • Discuss and conclude about the usage of sets as output of solve.
  • Complete the audit of the current solvers
  • Decide the submodules into which the current solve needs to be broken.
  • Summarize the discussion on the mailing list, and answer the questions raised there.
  • Discuss the new algorithms to be implemented in solve.
@hargup

Member

hargup commented Feb 21, 2014

@rlamy

Member

rlamy commented Feb 21, 2014

@hargup I'm glad to see someone tackling this. I probably won't have time to help but I certainly approve of your plan, particularly concerning the first issue.

@mrocklin

Member

mrocklin commented Feb 21, 2014

I agree with @rlamy that this feels positive.

Question about the following line:

System of equations x + y == 2 and x - y == 0: FiniteSet((1, 1))

Sets alone might not be appropriate for multivariate problems. We also need to map back to the variables x, y. This is similar to the information missing/added when we move from sets to booleans.

edit: I see that later on you do mention that multivariate sets are an issue.
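
To make the mapping concern concrete, here is a minimal sketch using `linsolve`, which was added to SymPy after this discussion. It returns a FiniteSet of tuples ordered like the symbols it is given, so the caller has to attach the variable names again. The name `as_dicts` is only illustrative.

```python
from sympy import FiniteSet, linsolve, symbols

x, y = symbols("x y")

# Solve x + y == 2 and x - y == 0; the tuple order follows (x, y).
solutions = linsolve([x + y - 2, x - y], (x, y))

# A bare set of tuples carries no variable names, so map them back explicitly.
as_dicts = [dict(zip((x, y), sol)) for sol in solutions]
```

The set alone is enough for set operations, but any consumer that needs `x` and `y` by name depends on the symbol order passed to the solver.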

@mrocklin

Member

mrocklin commented Feb 21, 2014

I think that this discussion would be improved with a motivation section on why improvements to solve are good for SymPy. I don't doubt that this is true, but it will add weight to a GSoC proposal and should serve to help focus efforts on solve.

Let's assume that not all of your goals will be accomplished. Which few are the most important for SymPy? Why?

@asmeurer

Member

asmeurer commented Feb 22, 2014

Don't forget about the assumptions system. It's very important, especially as you replace more and more parts of what you are doing with symbolic parameters.

Consider for instance the discussion at #2927, in particular, solve(abs(x) - y, x). The equation itself makes sense iff y is (real) nonnegative. In that case, the solution will be any complex x with sqrt(re(x)**2 + im(x)**2) = y (which doesn't say much, since sqrt(re(x)**2 + im(x)**2) is just another way to write abs(x)). There are infinitely many such x if y is positive: the solution is a circle in the complex plane of radius y. If x is limited to be real, the solutions are [y, -y].

That's a relatively simple example. You should consider how the assumptions will propagate to more complex ones.

Also, how can solve avoid even looking for certain kinds of solutions if certain assumptions are present? At #2897 we discussed skipping the non-rational polynomial solvers if the variable is set to be an integer (the discussion was on IRC). There's a related idea at #2715. In some cases, specifying a constraint may avoid otherwise exponentially many solutions.
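
As a rough illustration of how a domain or an assumption narrows the answer, the sketch below uses a concrete right-hand side instead of the symbolic `y`, to keep the output simple. `solveset` and its domain argument come from later SymPy releases and are not the API under discussion here.

```python
from sympy import Abs, FiniteSet, S, Symbol, solve, solveset

x = Symbol("x")
p = Symbol("p", positive=True)

# Restricting the domain to the reals gives the two-point solution.
real_solutions = solveset(Abs(x) - 2, x, S.Reals)

# solve drops candidate roots that contradict the symbol's assumptions.
positive_roots = solve(p**2 - 4, p)
```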

@asmeurer

Member

asmeurer commented Feb 22, 2014

This is an interesting idea by the way, opening a pull request with a markdown file. Usually we just use an issue, along with the wiki. But this way, we can add line notes to the discussion.


2. Current solve isn't reliable enough. Currently it is not guaranteed that solve will
return all the solutions of the equations it solves. At the least, we should know when
it has not returned all the solutions, for example while evaluating imageset.

@asmeurer

asmeurer Feb 22, 2014

Member

Is it ever important to know that each of the solutions returned by solve is unique? This is a related but separate issue, which is more tied to the assumptions system. Consider for instance solve(x**2 - a**2, x). The solutions are [a, -a]. These are distinct solutions, unless a = 0. Even in that case, the result is not entirely wrong if you count multiplicity.

For the cases where solve is used, you should think about whether two solutions being the same still gives a mathematically correct result. For instance, if we constructed Interval(-a, 0).union(Interval(0, a)), it would technically still be correct if a = -a. But if something fundamentally depended on the cardinality of the solution set, then it wouldn't be.
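
A small sketch of the cardinality point: the list returned by `solve` has two entries, but once `a = 0` is substituted and the results are collected into a set, only one element remains. The variable names are illustrative.

```python
from sympy import FiniteSet, solve, symbols

a, x = symbols("a x")

roots = solve(x**2 - a**2, x)  # two symbolic roots, a and -a

# Substituting a = 0 and collecting into a set merges the two roots.
roots_at_zero = FiniteSet(*[r.subs(a, 0) for r in roots])
```

Code that relies on `len(roots)` would therefore disagree with code that relies on the size of the resulting set.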

@skirpichev

Contributor

skirpichev commented Feb 22, 2014

Offtopic.

This is an interesting idea by the way, opening a pull request with a markdown file.

SEP (SymPy enhancement proposal) :)

We can create a separate repo for this. Every new PR - a separate markdown file. Otherwise, workflow can be very similar to PEP.

@asmeurer

Member

asmeurer commented Feb 23, 2014

We've always just discussed things on the list and made some wiki pages, but perhaps creating a (semi) formal SymPEP process wouldn't hurt. Also, a lot of the major work on SymPy has been done through GSoC, so GSoC proposals have also filled this role.

@hargup

Member

hargup commented Feb 23, 2014

At #2897 we discussed skipping the non-rational polynomial solvers if the variable is set to be an integer (the discussion was on IRC).

Can you point me to IRC logs?

Don't forget about the assumptions system. It's very important,

When we do assume(Q.positive(x)) we are implicitly assuming that
x belongs to the set (0, oo). It would be helpful if solve did not depend on the
context and instead took that assumption explicitly, through some parameter, say input_set,
that specifies the set of values the input variables can take. Assumptions become easier to work with
this way, because we can simply intersect the solution set obtained from the generic solvers with the input set.

def solve(expr, vars, input_set):
    # generic_solve is a placeholder for the generic solver
    res = generic_solve(expr, vars)
    return res.intersect(input_set)

Then we might develop different solvers for specific input sets in the backend.
For example, if the input set is Integers we might just use the diophantine
equation solving module. Then we might also separate
the real solver and the complex solver. For example, I don't think we will ever be able to solve things like sin(x) == x for x in the complex domain, but we can be sure that x=0 is the only real solution.

Also it would be great to have capability to directly pass a set to assume. Sets and booleans are basically the same thing, right?

btw this doesn't work

In [26]: x = var('x', integer=True)

In [27]: solve(sin(x*pi), x)
Out[27]: []
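
The intersection idea can be sketched with `solveset`, a later SymPy API that takes the domain as an argument. It is used here only as a stand-in for the proposed generic solver, and the variable names are illustrative.

```python
from sympy import FiniteSet, Interval, S, Symbol, oo, solveset

x = Symbol("x")

# All real solutions from the general solver.
general = solveset(x**2 - 4, x, S.Reals)

# Intersect with the input set to keep only x >= 0.
restricted = general.intersect(Interval(0, oo))
```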
@asmeurer

Member

asmeurer commented Feb 23, 2014

Can you point me to IRC logs?

http://colabti.org/irclogger/irclogger_log/sympy?date=2014-02-09#l48

When we are doing assume(Q.positive(x)) we are implicitly assuming that
x belongs to the set (0, oo), it would be helpful if we don't depend on the
context and make that assumption explicit in solve, with some parameter say inputSet,
which will decide the set of values the input variables can take. It will be easier to work with
assumptions this way as we can easily do the intersection of the solution set obtained from the generic solvers.

The assumptions system and the sets module are closely tied. I haven't thought fully about how they should be integrated yet. The sets cannot be used directly in the assumptions (at least not the way they work now), because the assumptions require "facts" to be written out, like Implies(Q.positive(x) & Q.positive(y), Q.positive(x*y)).

I imagine a lot of what the assumptions are capable of can also be done with the sets. I am a little more confident of the assumptions, because they use the SAT solver, which is capable of things like noticing inconsistencies (when I talk about the assumptions, I am mostly thinking of my work at #2508).
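
For reference, the kind of fact quoted above can be queried through `ask`. This is a minimal sketch; the result names are illustrative.

```python
from sympy import Q, ask, symbols

x, y = symbols("x y")

# Is x*y positive, given that x and y are both positive?
product_positive = ask(Q.positive(x * y), Q.positive(x) & Q.positive(y))

# With no assumptions the question cannot be decided.
undecided = ask(Q.positive(x * y))
```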

Then we might also separate
the real solver and the complex solver. For example I don't think we will ever be able to solve things like sin(x) == x for x in complex domain, but can be sure that x=0 is the only real solution.

That's a good point. Getting all solutions might not be possible in general, but it can be possible if we limit our domain. So you should think about how to express the "all solutions found" question so that it can be answered in the negative for general domains but in the positive for restricted domains.

Sets and booleans are basically the same thing, right?

A set is not a boolean, but the expression "x is an element of S" is a boolean.
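
The distinction can be seen directly in the sets module. In this sketch, `contains` and `as_relational` turn a set plus an element into a boolean, while the set itself is not one. The variable names are illustrative.

```python
from sympy import Interval, S, Symbol
from sympy.logic.boolalg import Boolean

x = Symbol("x", real=True)
unit = Interval(0, 1)

# Membership of a concrete number evaluates to a boolean.
in_unit = unit.contains(S.Half)

# The statement "x is in [0, 1]" as a boolean expression in x.
as_condition = unit.as_relational(x)
```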

@asmeurer

Member

asmeurer commented Feb 23, 2014

Another thing that isn't mentioned here, but which is important, is how to represent parameterized solutions. Solutions of diophantine equations are an example of this. So are the solutions to sin(x) = 1. Another example that came up recently is abs(x) - 1. The solutions are exp(I*theta), where theta is real (one might replace this with the more convenient exp(I*pi*theta)).

These can no doubt be represented using sets, but there are questions of usability, like how can the user easily get the parameters of the solution, and replace explicit values for them? The Diophantine module skirted this question by inventing its own output formats, but it should be unified with the rest of solve.

@mrocklin

Member

mrocklin commented Feb 23, 2014

We've talked about sets and assumptions in the past, in particular I
remember a PR in which someone wanted to extend new assumptions to handle
questions like

ask(Q.positive(x - 1), x > 3)

To solve it they messed around with the handlers in an unclean way. This
gave rise to the idea that we should have multiple mechanisms behind the
new assumptions interface. Sets could be one.

@asmeurer

Member

asmeurer commented Feb 23, 2014

I haven't come up with a clear way in my head to do it with my ideas from #2508 yet, though. My thoughts so far have been:

  • It's probably a good idea to reduce all inequalities to Q.positive(expr) or Q.nonnegative(expr) (i.e., subtract the less than side from both sides). We then need some algorithm that generates useful facts from these expressions (useful meaning the SAT solver can make any kind of deduction we would want it to).
@mrocklin

Member

mrocklin commented Feb 23, 2014

I guess I'm proposing that the SAT system is only one of many inference backends used by new assumptions. Sets and polys contain a fair amount of logic for these sorts of problems. It would be nice to leverage them without having to know exactly how they work.

Currently new assumptions is both an interface (ask, Q) and a particular solution method (SAT). We should be able to break that open a bit.

@asmeurer

Member

asmeurer commented Feb 23, 2014

You may be right. So far, I've been trying to do as much with the SAT solver as possible, because if you break away from it, you lose all its inference power (and the ability to detect contradictions and the ability to work with logical expressions no matter how they are formatted).

But we shouldn't derail this discussion with the assumptions system. @hargup will need to assume that the assumptions system works as well as it can for his project, as fixing the assumptions is a whole project of its own (which isn't to say I wouldn't like to see him do some work here and there on the assumptions).

@mrocklin

Member

mrocklin commented Feb 23, 2014

But we shouldn't derail this discussion with the assumptions system

Agreed.

@hargup

Member

hargup commented Feb 26, 2014

I'm almost done with the audit of solvers for univariate functions at https://github.com/sympy/sympy/wiki/solvers. @smichr can you review it? I'm not sure that I understand it completely, and it is possible that I have missed some parts.

@asmeurer will it be OK if I cannot complete the audit in the application period?

@hargup

Member

hargup commented Feb 26, 2014

A set is not a boolean, but the expression "x is an element of S" is a boolean.

Oops, that's what I meant to say.

So you should think about how to express the "all solutions found" question so that it can be answered in the negative for general domains but in the positive for restricted domains.

For this, creating a Solution class might help; we can wrap it around the solution set with the largest domain where we are sure we have found all the solutions. Something like

In: solve(x**2 - 1, x)
Out: Solution(FiniteSet(-1, 1), S.Complexes)

In: solve(sin(x) + x, x)
Out: Solution(FiniteSet(0), S.Reals)

Another thing that isn't mentioned here, but which is important, is how to represent parameterized solutions.
These can no doubt be represented using sets, but there are questions of usability, like how can the user easily get the parameters of the solution, and replace explicit values for them?

We will be representing parameterized solutions by imageset and I think its interface is already convenient enough.

In [16]: i = imageset(t, pi*t, S.Integers)

In [17]: i.lamda
Out[17]: Lambda(t, pi*t)

In [18]: i.lamda.expr
Out[18]: pi*t

In [19]: i.lamda(0)
Out[19]: 0
@asmeurer

Member

asmeurer commented Feb 27, 2014

@asmeurer will it be OK if I cannot complete the audit in the application period?

Probably, although the more you can do the better, since what you discover will change your course of action (i.e., the application itself). I think you're making good progress, though.

A set is not a boolean, but the expression "x is an element of S" is a boolean.

Oops, that's what I meant to say.

This is an important distinction, I think. We should make sure that functions that use booleans use booleans and functions that use sets use sets. For instance, solve should never return x > 3 as a solution, even a parameterized one, because it's not a set. Right now it does return a boolean sometimes (like solve(x**2 > 9)).
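
As a point of comparison, the later `solveset` API returns a set for the same inequality when it is given a real domain. The sketch below checks membership only and does not assert a particular printed form.

```python
from sympy import S, Set, Symbol, solveset

x = Symbol("x")

# The solution of x**2 > 9 over the reals, returned as a set rather than a relational.
region = solveset(x**2 > 9, x, S.Reals)

inside = 4 in region
on_boundary = 3 in region
```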

For this, creating a Solution class might help; we can wrap it around the solution set with the largest domain where we are sure we have found all the solutions.

That's not a very good name for the class, but this seems like a good idea. There may be better ways but I can't think of them right now.

We will be representing parameterized solutions by imageset and I think its interface is already convenient enough.

What if there are multiple parameters?

Also, what about the case of linear systems where the solution set can always be parameterized by one of the original variables (assuming it's positive-dimensional)?
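
Both cases can be sketched with current SymPy, under the assumption that an ImageSet over several base sets is an acceptable representation. The Gaussian-integer example and the variable names are illustrative, and `linsolve` was added after this discussion.

```python
from sympy import FiniteSet, I, ImageSet, Lambda, S, linsolve, symbols

m, n, x, y = symbols("m n x y")

# Two integer parameters: the Gaussian integers m + n*I.
gaussian = ImageSet(Lambda((m, n), m + n * I), S.Integers, S.Integers)
sample = gaussian.lamda(1, 2)  # plug explicit parameter values in

# An underdetermined linear system is parameterized by a free original variable.
line = linsolve([x + y - 2], (x, y))
```

In the linear case the free variable `y` plays the role of the parameter, so no new symbol has to be introduced.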


1. We don't have a consistent output for various types of solutions

We need to return a lot of types of solutions consistently:

@skirpichev

skirpichev Feb 27, 2014

Contributor

You missed the most important variant: "we don't know".

@hargup

hargup Feb 28, 2014

Member

We can do two things here: we can raise a NotImplemented error, or, as mentioned on the ideas page, we can create and return an unevaluated Solve object. One advantage of raising a NotImplemented error is that it will
explicitly tell the user that work on SymPy is not yet complete and that they might help us
improve it. @asmeurer can you tell me about some particular use cases of an unevaluated solve object?

@asmeurer

asmeurer Feb 28, 2014

Member

Just representing that the solution is there is one use case. If solve returns some Solution object, that gives more information than if it returns [] or raises NotImplementedError: namely, that it knows that there is a solution.

By the way, a nit-pick. It's NotImplementedError, not NotImplemented. This is important because the latter also exists in Python, but is a completely different thing.
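
The difference is easy to see in plain Python. In this sketch the `Meters` class and `solve_quintic` function are made up for illustration: one returns the `NotImplemented` sentinel from a binary operator, the other raises the `NotImplementedError` exception.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Meters):
            # A sentinel value, returned (not raised), so Python can try other.__radd__.
            return NotImplemented
        return Meters(self.value + other.value)


def solve_quintic():
    # An exception class, raised to signal missing functionality.
    raise NotImplementedError("no general closed form")
```

Adding a `Meters` to an unrelated type ends in a `TypeError` raised by the interpreter, while calling `solve_quintic()` raises `NotImplementedError` directly.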

@skirpichev

skirpichev Mar 14, 2014

Contributor

On Fri, Feb 28, 2014 at 07:11:22AM -0800, Harsh Gupta wrote:

improve it. @asmeurer can you tell me about some particular use cases
of unevaluated solve object.

I'll try to answer.

We can use this object for bookkeeping of some problem.

Sometimes we can't solve the problem, but

  0. we can transform the original problem to some "canonical" form
  1. we can produce a numerical solution (evalf)
  2. or a series solution (e.g. if we can solve the problem exactly for some
    particular parameter value)
  3. we can deduce some assumptions (e.g., that the solution set is finite, or that all
    solutions are positive or integers)

However, this is not a specific issue for solve, we already have Limit,
Sum, Integral, ImageSet and so on.
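
For comparison, this is how one of the existing unevaluated objects behaves. The sketch keeps an `Integral` symbolic and only asks for a numerical value; the particular integrand is an arbitrary choice.

```python
from sympy import Integral, Symbol, exp

x = Symbol("x")

# The object records the problem without attempting to solve it.
unevaluated = Integral(exp(-x**2), (x, 0, 1))

# A numerical value is still available on request.
numeric = unevaluated.evalf()
```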

@skirpichev

Contributor

skirpichev commented Feb 27, 2014

On Sun, Feb 23, 2014 at 11:31:38AM -0800, Matthew Rocklin wrote:

We've talked about sets and assumptions in the past, in particular I
remember a PR in which someone wanted to extend new assumptions to handle
questions like

ask(Q.positive(x - 1), x > 3)

To solve it they messed around with the handlers in an unclean way.

Probably, it's pr #1907

@skirpichev

Contributor

skirpichev commented Feb 27, 2014

On Wed, Feb 26, 2014 at 11:03:37AM -0800, Harsh Gupta wrote:

For this, creating a Solution class might help; we can wrap it around the
solution set with the largest domain where we are sure we have found all
the solutions.

Bad idea. It's better to explicitly set up this domain, as an argument for
solve(), for example. Like assume parameter in sympy.solvers.inequalities.

In: solve(x**2 - 1, x)
Out: Solution(FiniteSet(-1, 1), S.Complexes)

In: solve(sin(x) + x, x)
Out: Solution(FiniteSet(0), S.Reals)

Solution of what?

@asmeurer

Member

asmeurer commented Feb 28, 2014

Bad idea. It's better to explicitly set up this domain, as an argument for
solve(), for example.

I think the point is that solve can tell us that it only knows how to solve in some limited domain. Requiring the user to specify it doesn't make sense, because it's really a factor of what solve knows.

@skirpichev

Contributor

skirpichev commented Mar 1, 2014

On Fri, Feb 28, 2014 at 02:19:23PM -0800, Aaron Meurer wrote:

I think the point is that solve can tell us that it only knows how to
solve in some limited domain.

This is not specific to solve. For example, sometimes we can
evaluate a limit only if some parameter is positive. The same goes for integrals,
sums...

Requiring the user to specify it doesn't
make sense, because it's really a factor of what solve knows.

It does, as it would be ridiculous to return an answer for some new
problem and a set of assumptions...

But this may work in slightly other way, like summation now (i.e., return
some Piecewise instance).
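
The summation behaviour referred to here can be sketched with the geometric series, where `doit()` returns a Piecewise whose first branch holds only under a convergence condition. The variable names are illustrative.

```python
from sympy import Piecewise, Sum, oo, simplify, symbols

n, x = symbols("n x")

# The closed form is only valid under a condition on x, so a Piecewise comes back.
geometric = Sum(x**n, (n, 0, oo)).doit()

# First branch: (closed form, condition under which it holds).
closed_form, condition = geometric.args[0]
```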

@hargup

Member

hargup commented Mar 1, 2014

That's not a very good name for the class

Agreed, then what about returning a tuple (<solution set>, <domain>) ?

It's better to explicitly set up this domain, as an argument for
solve()

Requiring the user to specify it doesn't
make sense, because it's really a factor of what solve knows.

It does, as it would be ridiculous to return answer for some new
problem and a set of assumptions.

Explicitly setting up the domain will help us in reducing the search space for
the solution plus I feel that it will also be helpful in implementation and not just
in user interface.

But with that we should also return the domain of surety of the solution.
For example, to solve log(x + 1) + log(x - 1) the expression is simplified to
log((x + 1)*(x - 1)) assuming x > 1. Since the simplification is only valid for x in
(1, oo), we cannot be confident that there is no solution in another domain, say for x in
the reals. So, in this case we should not expect the user to explicitly set up the
domain as (1, oo).

So, the interface should be something like

In[1]: solve(<expr>, inset=<input set>)
Out[1]: (<solution set>, <Intersection(domain of surety, input set)>)
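
A rough sketch of that interface, assuming the later `solveset` API as the underlying solver. The helper `solve_with_surety` and its parameter names are hypothetical and not part of SymPy, and the caller supplies the domain of surety by hand here, whereas the proposal has solve determine it.

```python
from sympy import FiniteSet, Interval, S, Symbol, oo, solveset, sqrt

x = Symbol("x")


def solve_with_surety(expr, symbol, input_set, surety_domain):
    """Hypothetical helper: return (solutions, domain on which they are complete)."""
    checked = input_set.intersect(surety_domain)
    return solveset(expr, symbol, checked), checked


# log(x + 1) + log(x - 1) == 0 reduces to (x + 1)*(x - 1) == 1, but only for x > 1.
solutions, checked = solve_with_surety(
    (x + 1) * (x - 1) - 1, x, S.Reals, Interval.open(1, oo)
)
```

The second element of the pair tells the caller that nothing has been claimed about solutions outside (1, oo).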
@asmeurer

Member

asmeurer commented Mar 1, 2014

I think this is analogous to returning a piecewise expression. In fact, perhaps we can rethink @hargup's idea so that it literally is just a Piecewise. Piecewise((all the solutions, x in some domain), (maybe not all the solutions, otherwise)).

@skirpichev

Contributor

skirpichev commented Mar 2, 2014

But with that we should also return the domain of surety of the solution.
So, the interface should be something like
In[1]: solve(<expr>, inset=<input set>)
Out[1]: (<solution set>, <Intersection(domain of surety, input set)>)

It would be very surprising (to be polite) to return the answer for a different problem.

So, in this case we should not expect the user to explicitly set up the
domain as (1, oo).

Why not? If solve is not successful with the default domain - user may try to refine assumptions.

@mrocklin

Member

mrocklin commented Mar 4, 2014

I think that some of this discussion, while very valuable, is altogether separate from the original issue of how @hargup should formulate his application for GSoC. I think that we should move it to a separate issue.

@rfateman

rfateman commented Mar 4, 2014

On 3/4/2014 10:21 AM, Harsh Gupta wrote:

Is sqrt(4) the set {-2,2}? If you say it is simply 2, I think you
are in big trouble.

You also mentioned this in your sets paper. Well, sympy currently
returns only 2 and we simplify sqrt(y**2) to abs(y) given y is real.
Can you elaborate on the reason to advocate returning set {-y, y} for
sqrt(y) ?


You can define sqrt(4) to be just 2.
You can define arcsin(1) to be just pi/2
You can define log(x) to be log(abs(x)) in case x<0

But then you will not have the correct properties expected of these
functions in the more sophisticated setting that prevail in, say
complex variables courses in college.

Draw a graph of y=abs(x) and you see a v-shape. That is not an algebraic
curve.
How can it be the solution to an algebraic equation y^2=x ?

Draw a graph of the locus of points satisfying y^2=x. There are two
curves, and neither
one is shaped like a v.

For a choice of branch cut for sqrt(r), why does it matter what r is?
unless r=0, there are two values. Even if r is imaginary or complex.

Can you explain why sqrt(y^2) should be abs(y)? I have 3 explanations:

  1. "when I was in high school, I think someone told me that"?
  2. "I'm a physicist and I can always check my results against physical reality so that if I make
    a mistake here I can always come back with my eraser and fix it. Oh, I can't
    erase in the computer??"
  3. For a while, under some conditions, other programmers made the same mistake
    in Mathematica, Maple, Maxima, until someone stomped on them. So then they
    made a half-assed effort to fix it up.

RJF
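
For readers following along, this is what SymPy does with the expressions under debate. The sketch only records current behaviour and takes no side in the argument; the variable names are illustrative.

```python
from sympy import Abs, FiniteSet, Symbol, solveset, sqrt

y = Symbol("y")                  # no assumptions, so y may be complex
y_real = Symbol("y", real=True)

principal = sqrt(4)              # the principal root only
unsimplified = sqrt(y**2)        # left alone without assumptions
real_case = sqrt(y_real**2)      # becomes Abs(y_real)

# Solving the equation, by contrast, keeps both roots.
both_roots = solveset(y**2 - 4, y)
```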

@asmeurer

Member

asmeurer commented Mar 4, 2014

Or is sympy unable to rely on interaction?

No. One of the very important use-cases of SymPy is that it be usable as a library, meaning zero user interaction.

Regarding your critique of Python, note that Q.integer(3) and S.Naturals are, except for their particular syntactic representation, our invention, not Python per se. The two are not tied together at the moment, but that's more an issue of lack of things being implemented in the assumptions than design.

Regarding sqrt(2), in my opinion, this sort of thing is better left to the polys module, where sqrt(2) is represented by its minimal polynomial y**2 - 2 (note that not a lot of algebraic stuff is implemented there yet, so a lot of this also just means the future polys module).
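
The minimal-polynomial representation mentioned here can be obtained with `minimal_polynomial`, as in this short sketch:

```python
from sympy import Symbol, minimal_polynomial, sqrt

y = Symbol("y")

# The polynomial of least degree with rational coefficients that has sqrt(2) as a root.
min_poly = minimal_polynomial(sqrt(2), y)
```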

@asmeurer

Member

asmeurer commented Mar 4, 2014

@rfateman I would love to hear your thoughts on what I am doing at #2508, if you have the time. The goal is to be able to write things like ask(Q.zero(x) | Q.zero(y), Q.zero(x*y)) (given that x*y is zero, is it true that x is zero or y is zero?). The method I am using is to generate a bunch of relevant "facts" from the given expressions, and to dump those into the SAT solver. The advantages are that we can write down facts in a very high level language (see https://github.com/sympy/sympy/pull/2508/files#diff-bb2c25daed16ce71d13be11d912725edR286), that the assumptions can handle arbitrary logical expressions, and that the system is "self protecting" against logical inconsistencies. The disadvantages so far are mainly about performance (it's too slow to generate the clauses; this seems to be a common theme with SAT solvers). I'm curious how Maxima handles assumptions, and if you think it is done well (and if not, how you would do it if you were to start again).
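
Later SymPy releases include a SAT-based `satask` function in `sympy.assumptions.satask`. Assuming that function, the query described above can be written as the following sketch:

```python
from sympy import Q, symbols
from sympy.assumptions.satask import satask

x, y = symbols("x y")

# Given that x*y is zero, is at least one of x, y zero?
zero_factor = satask(Q.zero(x) | Q.zero(y), Q.zero(x * y))
```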

@asmeurer

Member

asmeurer commented Mar 4, 2014

  1. Some of your objectives are unattainable since (for example) there is no algorithmic decision process to determine if f(x,y,z) is identically zero. Given that, f(x,y,z)*x +3 =0 may or may not have no solution whatsoever.

Sure. Algorithmic issues aside, this can also come up with any f for which we haven't yet written down the logic of when it has zeros.

But if there is a class of expressions for which we can do this, and that class can be made large enough to be computationally useful, then I think we should do what we can. There definitely is such a class (e.g., it at least contains univariate polynomials, and I think it can probably be made larger than that).

  1. There are many kinds of things that you could feed to solve. It would be much much better if you considered particular classes of problems and how to solve them with particular programs. After all, almost anything in a CAS can be addressed as input to some "solve" as... solve().

Yep. That's why we have this issue: "solve() is a giant mess".

  1. The role of assumptions in doing some parts of solve "right" is extremely important. The fact that assumptions were added after the fact in Macsyma was a disaster. Unfortunately repeated in Maple, and Mathematica.

SymPy has a good assumptions system, but it is limited in what it can express. Our attempt to extend it has been shaky. I'm hoping to get it "right" at #2508.

  1. There are too many pieces to return comfortably. Multiple solutions, multiplicities, extra conditions (restrictions on parameters), extra added parameters. Expect this to be messy.

This is something we are trying to figure out here. If you have any suggestions, we'd love to hear them.

  1. Some solutions should be burped out in pieces with names.

I don't understand what this means.

  1. some solutions are too bulky to ever display and should be abbreviated.

SymPy doesn't typically do this, though we could probably extend the printer to do it. But also keep in mind what I said before about SymPy being primarily a library, not an application.

  1. It is highly likely that, in the tradition of building CAS, you are standing not on the shoulders of giants, but on their feet. It just might pay to look at the code to see how (for example) cubic and quartic equations are handled. And if people ask only for the real solutions, you should realize that you just might be able to tell.

Yes, likely. The use of esoteric languages (yes, to many of us, lisp is esoteric) doesn't help either.

  1. You might consider offering numerical solutions, in which case there is a huge literature even for linear systems. If you don't use someone else's canned library you are pretty much dooming yourself to reinventing the wheel.

Matthew already discussed this. SymPy already has a big enough scope being a symbolic library. There is a very well developed numeric ecosystem in Python. We should not implement numeric routines, but rather ways to pass off SymPy expressions to the already well built numeric solvers that exist in Python.
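
One existing hand-off mechanism is `lambdify`, which turns a SymPy expression into a plain Python callable. The sketch below stops at producing the callable; passing it to an external root finder (for example one from SciPy) is left as an assumption and is not shown.

```python
from sympy import Symbol, cos, lambdify

x = Symbol("x")

# A plain Python function of one float, built on the standard math module.
residual = lambdify(x, cos(x) - x, modules="math")

value_at_zero = residual(0.0)
```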

I want to thank you for taking the time to comment here. It's great to hear the opinions of those who are knowledgeable about the problems we are trying to solve.

@rfateman


rfateman commented Mar 4, 2014

On 3/4/2014 2:04 PM, Aaron Meurer wrote:

Or is sympy unable to rely on interaction?

No. One of the very important use-cases of SymPy is that it be usable
as a library, meaning zero user interaction.

I think that detracts from its usefulness, but this is a point of
contention in the Maxima design. It perhaps asks
the user too many questions.

Regarding your critique of Python, note that |Q.integer(3)| and
|S.Naturals| are, except for their particular syntactic
representation, our invention, not Python per se. The two are not tied
together at the moment, but that's more an issue of lack of things
being implemented in the assumptions than design.

Regarding sqrt(2), in my opinion, this sort of things is better left
to the polys module, where sqrt(2) is represented by its minimal
polynomial |y**2 - 2| (note that not a lot of algebraic stuff is
implemented there yet, so a lot of this also just means the future
polys module).

It might be advisable theoretically, but users might object. It is a
dilemma what to do.
People will want to write sqrt(2), not RootOf(y^2-2).
Incidentally, you can specify a particular sqrt(2) (or whatever) if you
have isolating intervals for
the roots of y^2-2=0, and specify one such interval. This solves the
problem of whether
sqrt is the set of all square roots (the unadorned Rootof) or a
particular one, which might be, e.g.,

RootOf(y^2-2,  {0,infinity})    would be the positive one.

   I suppose this could be generalized to

RootOf(sin(y)), but that might not be the nicest way to specify arcsin,
either.

Maybe the point here is that there is a design / usability issue that
has to be
considered in addition to the mathematical one.

Of course if the mathematics is wrong, the design is, uh, not going to
save it.
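
SymPy's polys module already exposes both ingredients mentioned here: isolating intervals for real roots, and indexed root objects. A minimal sketch follows; the quintic is an arbitrary example whose roots have no radical form.

```python
from sympy import Poly, Symbol, rootof, sqrt

y = Symbol("y")

# Isolating intervals for the real roots of y**2 - 2, in increasing order.
isolating = Poly(y**2 - 2, y).intervals()
(low, high), multiplicity = isolating[-1]  # the interval around the positive root

# An indexed root object; index 0 selects the first real root.
real_root = rootof(y**5 - y - 1, 0)
```

Selecting a root by index plays the same role as naming an isolating interval: it picks one specific root of the polynomial instead of the whole set.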



@rfateman

rfateman commented Mar 4, 2014

On 3/4/2014 2:08 PM, Aaron Meurer wrote:

@rfateman I would love to hear your thoughts on what I am doing at
#2508, if you have the time. The goal is to be able to write things like
`ask(Q.zero(x) | Q.zero(y), Q.zero(x*y))`
(given that x*y is zero, is it true that x is zero or y
is zero?). The method I am using is to generate a bunch of relevant
"facts" from the given expressions, and to dump those into the SAT
solver. The advantages are that we can write down facts in a very high
level language (see
https://github.com/sympy/sympy/pull/2508/files#diff-bb2c25daed16ce71d13be11d912725edR286),
that the assumptions can handle arbitrary logical expressions, and
that the system is "self protecting" against logical inconsistencies.
The disadvantages so far are mainly about performance (it's too slow
to generate the clauses; this seems to be a common theme with SAT
solvers). I'm curious how Maxima handles assumptions, and if you think
it is done well (and if not, how you would do it if you were to start
again).
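
The "generate facts, then ask the SAT solver" idea can be sketched at the purely propositional level as follows. This is not the code from #2508: the symbols `zx`, `zy`, `zxy` and the helper `entails` are hypothetical stand-ins introduced here, and only `satisfiable` from `sympy.logic.inference` is an existing SymPy function.

```python
from sympy import symbols
from sympy.logic.boolalg import And, Equivalent, Not, Or
from sympy.logic.inference import satisfiable

# Hypothetical propositional stand-ins for Q.zero(x), Q.zero(y), Q.zero(x*y).
zx, zy, zxy = symbols('zx zy zxy')

# One "fact" about products: x*y is zero exactly when x or y is zero.
facts = Equivalent(zxy, Or(zx, zy))


def entails(facts, given, query):
    """Return True when facts and given force query.

    The query is entailed exactly when facts & given & ~query has no model.
    """
    return satisfiable(And(facts, given, Not(query))) is False


print(entails(facts, zxy, Or(zx, zy)))  # the disjunction is forced
print(entails(facts, zxy, zx))          # x alone is not forced to be zero
```

The cost mentioned above shows up in the step this sketch skips: deciding which facts are relevant and turning them into clauses.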



I just glanced at this stuff.

  1. I think you are right that a SAT solver won't scale.
  2. There are several ways of handling assertions that are
    discussed at great length in the artificial intelligence
    literature. I think that searching for "truth maintenance"
    will get you a bunch.
  3. I don't know much about the Maxima assumption database,
    except that some people find it disappointing. It was written
    initially by Michael Genesereth, now a professor at Stanford.
  4. I think the Maple system takes a more studied approach, which
    is to allow linear inequalities, e.g. a+b>0, and solve the
    geometric problems derived from questions.
    I found this link:
    http://www.maplesoft.com/support/help/Maple/view.aspx?path=SolveTools/Inequality/LinearMultivariateSystem
  5. This stuff is different from
    (a) non-linear polynomials
    (b) non-algebraic stuff (x^y, exp, log, sin, abs, ...)
    (c) (usually) type stuff, e.g. integer, real, ...
    I say "usually" because it IS possible to assert that n is an
    integer by assume(sin(n*pi)=0), but that is probably a bad idea.

I think that in a hands-off batch library you are already expecting
users to compose well-formed computations. You could ask them to
specify which of several assumption systems to use, and be very
careful and specify what they can do. Since several of them
are theoretically plausible and uncommonly time-consuming, you
don't want to just pick (the wrong) one and come back a week later.

Note that a simple 'loop' can have an arbitrarily hard computation
in the termination test.

while (f(a)>0) do ....
or even
while (a>0) do ...

This was certainly a major issue with Macsyma: what tools to use
for is(), if() then else, etc.
What do you do if the condition a>0 is not provably true or false and
it appears in if(a>0) then dothis else dothat?
Maxima has one mode in which it is an error, and another
in which the value returned is the unevaluated "if (a>0) ..."
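
For comparison, a sketch of how the same situation looks in SymPy, assuming a recent release: an undecidable condition used as a Python truth value is an error, the assumptions query returns `None`, and `Piecewise` plays the role of the unevaluated conditional. The variable names are illustrative only.

```python
from sympy import Piecewise, Q, Symbol, ask

a = Symbol('a')
cond = a > 0  # stays an unevaluated relational; nothing is known about a

# "Error" mode: forcing the relational into a Python truth value fails.
try:
    bool(cond)
    outcome = "decided"
except TypeError:
    outcome = "error"

# Query mode: the answer is neither True nor False.
undecided = ask(Q.positive(a))

# "Unevaluated if" mode: the branch is chosen only once a has a value.
branch = Piecewise((1, a > 0), (-1, True))

print(outcome, undecided)
print(branch.subs(a, 3), branch.subs(a, -2))
```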

Some issues were resolved by using different commands..

is (equal(a,0)) is different and potentially more time-consuming than
is (a=0).
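
SymPy has a similar split, sketched below under the assumption of a recent release: `==` compares expression trees and is cheap, while `.equals()` attempts a mathematical comparison and may do much more work.

```python
from sympy import symbols

x = symbols('x')
factored = (x + 1)**2
expanded = x**2 + 2*x + 1

# Structural comparison: the two trees differ, so this is False.
structurally_equal = (factored == expanded)

# Mathematical comparison: SymPy tries to show the difference is zero.
mathematically_equal = factored.equals(expanded)

print(structurally_equal, mathematically_equal)
```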

Macsyma (not Maxima) had a decision procedure that could determine
a maximum precision needed to determine if a complicated expression
was zero or not. Mathematica does this too. I think the code might
have been written (both places) by Bill Gosper. I don't have a reference
for it.

I think Godel's theorem makes it impossible to determine all
inconsistencies unless by "arbitrary logical expressions" you
mean something less interesting. But maybe I'm not following this.
I haven't studied your files. Some people really like the kinds
of deductions done by Prolog. It has the advantage of being
rather easy to program.

Sorry if this is in the wrong thread. I'm ok for you guys to move it or
copy it.
I don't know 'bout git discussions.
RJF

@asmeurer

Member

asmeurer commented Mar 5, 2014

By the way,

univariate polynomial real roots, complex roots
linear systems
algebraic systems (polynomial equations)
single equations, non-polynomial, univariate or multivariate
inequalities
minimization
linear programming
ordinary differential equations
partial DEs
diophantine eqs

Most of these are already implemented. The issue here is more about unifying their interfaces than implementing the solvers themselves, though I would love to see more solvers implemented as well. (I was a little disappointed last semester when SymPy couldn't solve a very basic interest equation that I was having my college algebra class do, basic like one of these guys: http://qrc.depaul.edu/studyguide2009/notes/savings%20accounts/compound%20interest.htm ) Anyway, most solving is heuristic, meaning there are always new algorithms to implement and cases to cover.

@certik

Member

certik commented Mar 5, 2014

Can you report an issue with the exact equation that it cannot solve?


@asmeurer

Member

asmeurer commented Mar 5, 2014

@rfateman

rfateman commented Mar 5, 2014

On 3/4/2014 5:34 PM, Aaron Meurer wrote:

By the way,

univariate polynomial real roots, complex roots
linear systems
algebraic systems (polynomial equations)
single equations, non-polynomial, univariate or multivariate
inequalities
minimization
linear programming
ordinary differential equations
partial DEs
diophantine eqs

Most of these are already implemented.

This kind of list is really not informative.

...
For example, "implementing ordinary differential equation" -- does it do
everything in Kamke? Does it know when to use Laplace transforms?

or does it just have a piece of documentation that says
"ODESolve" (syntax) "implements ODEs" ?

or something in between?

The issue here is more about unifying their interfaces

I suppose that is of some use, though I don't really have an idea of the
sympy user community (other than some inkling that Sage would like sympy to
replace Maxima).

than implementing the solvers themselves,

This is reminiscent of a kind of design in human factors. It's
called Wizard of Oz experiments. You make believe you have programs that do
something, and worry about the interface. You fake the programs' operation
with humans, if need be.

It seems that if sympy had the math routines, other people would use it
to build different front ends. Like Sage. I would expect many different
front ends to be developed if you have sufficiently many non-experts
doing summer projects. Typically non-experts (who know little math)
write routines for plotting, parsing, display, web pages.

I was under the perhaps mistaken impression that sympy would have a mission
something like ginac
http://www.ginac.de/About.html
except replace C++ with python and maybe include more stuff and have a
crowd-programming model.

though I would love to see more solvers implemented as well (I was a
little disappointed last semester when SymPy couldn't solve a very
basic interest equation that I was having my college algebra do (basic
like one of these guys
http://qrc.depaul.edu/studyguide2009/notes/savings%20accounts/compound%20interest.htm).
Anyway, most solving is heuristical, meaning there are always new
algorithms to implement and cases to cover.

If sympy has no advanced features and only offers superficial
symbolic manipulation at relatively slow speed, then I would expect
the novelty of it being "in python" would be an insufficient
motivation for people to use it. The goal should be for sympy to
be better than the competition. But maybe I underestimate the
sales benefits of being in python.
RJF



@skirpichev

Contributor

skirpichev commented Mar 5, 2014

On Tue, Mar 04, 2014 at 01:37:13PM -0800, rfateman wrote:

  1. I don't think that python is easy to read. Or write -- see my syntax
    error in previous msg.

You needn't be an expert in the field, but it's assumed you know
the language syntax, if you want to contribute.

Whether python is easy to read (and to contribute to) is a subjective
question. I think it is: python has a much wider audience than the lisps
in the world. And there are reasons for this, including simple and clean syntax.

Is an imageset something like a
characteristic function? Presumably I could learn the answers by reading
something, but you are claiming it is easy for the non-expert.

Have you tried looking at the documentation of imageset (just type "imageset?"
in the isympy shell)?
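
For readers without a shell at hand, a short sketch of what an image set is, modelled on the example in SymPy's own documentation for `ImageSet`: the set of values a function takes over a base set. The name `squares` is only illustrative.

```python
from sympy import Lambda, S, imageset, symbols

x = symbols('x')

# The image of the natural numbers under x -> x**2, i.e. {1, 4, 9, ...}.
squares = imageset(Lambda(x, x**2), S.Naturals)

print(squares)
print(4 in squares, 5 in squares)
```

So an image set is described by a mapping and a domain, rather than by a characteristic function that tests membership directly.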

I think I still qualify as a non-expert in python :) There is way too much
syntax, and it does not
look to me like the syntax of ordinary math.

Well, this is valid for lisps too. I think we should agree that python
is closer to the syntax of ordinary math. For that reason, I guess, there is
a secondary language in Maxima, right?

At least lisp has the benefit
of consistency: you probably haven't seen this before, but once you learn
about parenthesized prefix notation it is all the same.

I don't think it's true. In fact, lisps have a lot of syntax
rules, e.g. something like (if p c a) in Scheme is very different
from the usual function call.

 1. I am much less sanguine about the value of contributions by
"non-experts". For example, non-experts often contribute "bug reports"
of which the majority are "user errors". Identifying them as such is a
burden on more-expert users.

If the project has too many such bug reports - it may be a sign of
weakness in some of its interfaces or in the documentation.

   Sometimes non-experts contribute bug
   fixes or patches that are bad news. They fix a small glitch but
   introduce a long-term more subtle problem.

It is up to the reviewer whether to commit these patches or not:
https://github.com/sympy/sympy/wiki/Development-workflow#wiki-reviewing-patches

After all, an expert in the field can be a newbie in the project codebase and,
for that reason alone, can introduce new and more complex problems. Should we
reject contributions from experts just for that reason?

@rfateman

rfateman commented Mar 5, 2014

On 3/5/2014 5:05 AM, Sergey B Kirpichev wrote:

On Tue, Mar 04, 2014 at 01:37:13PM -0800, rfateman wrote:

  1. I don't think that python is easy to read. Or write -- see my syntax
    error in previous msg.

You shouldn't be an expert in the field, but it's assumed you know
the language syntax, if you want to contribute.

This is quite backwards. To contribute to sympy you should be an expert in
some content-related subject. If the python language is simple to understand
(presumably both syntax AND semantics) then the thesis of the project could be
that you can learn it in a very short time. (Not always true; some mathematicians
have great difficulty learning to program well. Dunno why.)

Whether python is easy to read (and to contribute) - a subjective
question.

Yes.

I think, it is - python has much wider audience,

Probably not a good idea to judge "ease" by "popularity".
Is Windows easier than Unix?

c.f. lisp's in the world. And there are reasons for this, including simple and
clean syntax.

I think that Lisp, especially the Scheme dialect, being almost devoid of
syntax rules, has a simpler syntax. Python is simple only if you think that a
superficial (but false) equivalence to ordinary mathematics means simple, and
that those aspects totally dominate the rest of the language, which is pretty
much arbitrary programming language stuff.

Is an imageset something like a
characteristic function? Presumably I could learn the answers by reading
something, but you are claiming it is easy for the non-expert.

Have you tried to look to the documentation of imageset (just type
"imageset?" in the isympy shell)?

No. If something is simple if you read and understand the documentation,
then presumably flying a 747 is simple. (Maybe it is?) I assume the 747 is
documented.

I think I still qualify as a non-expert in python :) There is way
too much
syntax, and it does not
look to me like the syntax of ordinary math.

Well, this is valid for lisps too. I think we should agree that python
is closer to the syntax of ordinary math.

It is a false equivalence though. At least it is my belief that
python's "+" doesn't mean the addition of ordinary math, but 32 or 64 bit
addition, which is confusingly NOT the syntax of ordinary math.

Now if you use long ints (with an L suffix) it is more
math-like, but the suffix L is made-up novel syntax. Also 3/2 is
changed to 1.
This is hardly the syntax of ordinary math.

Also, if it were so simple, people learning ordinary math would not have
to remember the implied precedence of operators like + and *. And when
in the course of ordinary math did you encounter ** ?

Face it, python is just another programming language with a bunch of
arbitrary decisions imbedded in it. In my admittedly limited experience,
people who think that python is simple or natural have limited or no
other experience
with programming languages.

By that reason, I guess, there is
a secondary language in the Maxima, right?

There is a surface language in Maxima that contains lots of command function
names and looks more like conventional math with an Algol-60-ish programming
style. It allows a+b. The command foo() calls the underlying lisp
function $foo.
The data is essentially explicitly typed trees, though there are a number of
special forms, and ways of communicating with external programs.

A programmer can use either of the languages or both.

The idea that a CAS should have only one language for the user and for the
system developer has been explored for nearly half a century.

It is my impression that sympy in any case is not python, e.g. the
Natural Number example, so I don't know if it is one or two languages. Or even
if the sympy developers are aware of the history of the issue.

At least lisp has the benefit
of consistency: you probably haven't seen this before, but once you
learn
about parenthesized prefix notation it is all the same.

I don't think it's true. In fact, lisps have a lot of syntax
rules, e.g. something like (if p c a) in Scheme is very different
from the usual function call.

Yes, there are a handful of special forms. They all follow the SAME
syntax, except the shorthand for (quote x) as 'x. I think the complete list of
special forms in Scheme is

lambda, let, letrec, let*, define, set!, quote, quasiquote, if, case,
and, or, begin, do.

I have personally never used quasiquote, case, begin, do in scheme.

The ones that I have used are easy to teach to students. They are used
mostly in setting up bindings and defining functions. The logical `and` and
`or` do short-circuit evaluation and therefore do not have the same evaluation
rules as normal functions. They evaluate until the answer can be determined,
e.g. (or e1 e2 true e3) never evaluates e3.
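
Python's `or` follows the same short-circuit rule, which the sketch below makes visible by recording which operands actually get evaluated. The helper `e` and the list `calls` are made up for this illustration.

```python
calls = []


def e(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value


# Mirrors (or e1 e2 true e3): evaluation stops at the first truthy operand.
result = e("e1", False) or e("e2", 0) or True or e("e3", "never reached")

print(result)  # True
print(calls)   # ['e1', 'e2']
```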

So these are arguably all rather simple. Note that, contrary to python,
(+ 1/3 1/3) comes out as 2/3, and integers are all of arbitrary precision.

This is not to say lisp ignores details, e.g. when floating-point
numbers overflow, etc.
But these kinds of issues must be addressed (or ignored) in almost every
language
of sufficient scope.

  1. I am much less sanguine about the value of contributions by
    "non-experts". For example, non-experts often contribute "bug reports"
    of which the majority are "user errors". Identifying them as such is a
    burden on more-expert users.

If the project has too many such bug reports - it may be a sign of
weakness in some of its interfaces or in the documentation.

Sure. I think it is an especially foolish assignment of tasks to have
a newbie "write the documentation". So how do you fix this?
An expert must fix this.

A newbie writing documentation will generally have an incomplete
idea of the objectives, the methods, and the pitfalls of an elaborate
piece of code. His/her documentation thus consists of a probably-biased
view of part of the functionality. I recall one case in which newly
produced
documentation consisted entirely of exceptions to the proper expected
functionality. (Documentation of the edge cases is important but
should not dominate the description!)

Sometimes non-experts contribute bug
fixes or patches that are bad news. They fix a small glitch but
introduce a long-term more subtle problem.

It is up to the reviewer whether to commit these patches or not:
https://github.com/sympy/sympy/wiki/Development-workflow#wiki-reviewing-patches

Yes, so that means that all the patches are correct??

After all, an expert in the field can be a newbie in the project
codebase and just by that reason - can introduce new and more complex
problems. Should we reject contributions from experts just for that reason?

Of course it is worthwhile to carefully review all contributions before
incorporating them into some distributed system. There is a much better
potential payoff if the contribution has some "content" rather than code
representing some novice's idea of how to fix a bug he/she may not really
understand.

I am not by any means saying that subject-matter experts are necessarily
going to be good programmers. The evidence is overwhelming that (say)
mathematicians can be bad programmers. Or not.



@mrocklin

Member

mrocklin commented Mar 5, 2014

In my limited experience arguments about the simplicity or natural understanding of programming languages are rarely productive. I'm going to try to say some consensus building things.

First, I have a great respect for Lisp dialects. Most of my experience is with Clojure rather than Common Lisp. From an objective point of view Lisp dialects are clearly simpler; this does not mean that they are objectively easier (as noted, this is hard to judge). For systems like SymPy that manipulate expression trees there are a lot of obvious benefits. A mature SymPian should know Lisp, even if he does day-to-day work in Python.

That being said, the fact is that Lisps are not the dominant language paradigm taught or used in most fields today. Algol-like languages happen to be more popular. To that extent, Algol-like CASs may be more approachable to today's workers. I'm not saying that this is ideal, merely that this is the state of things.

SymPy's approach has added value to society. From anecdotal evidence (talking to people at conferences) I've gathered that a big part of this value comes from how easy SymPy is to install and how familiar SymPy seems to them given their current training. Let's also note that they rarely need anything fancy; often they want a catalog of special functions, the ability to take derivatives, and the ability to generate native code. For the audience of people-who-don't-know-any-better, I think that SymPy is actually pretty close to the optimal solution. There exists a different audience though, of people-who-know-better, and for them SymPy fails to impress. For example I rarely see (haven't seen) SymPy mentioned in the CAS literature.

I think that a lot of the current argument depends on one's values. If we can accept that different value systems are valid and that there exist multiple audiences then I think a lot of the argument can die away. In service of computer algebra, I'll say that SymPy has engaged a somewhat new audience, and that in the long run this is good for computer algebra.

@rfateman

rfateman commented Mar 5, 2014

I do not know the extent to which the python ecosystem has been adopted
in engineering and applied mathematics circles. There are obviously
some enthusiastic promoters (including at Berkeley). Some of them are
computer scientists and teachers. If you meet numpy and scipy enthusiasts
at a conference devoted to python, it is hardly surprising.

Moving engineering education at UC Berkeley from using Fortran to
using Matlab was significant progress. Moving from Matlab to python?
I don't know how that might happen. Symbolic manipulation (computer
algebra) has had a mixed reception by the Matlab company.
Initially they felt it was unnecessary, and the engineers seemed to
agree. Then Matlab added a symbolic toolkit using Maple. Then
they switched out Maple in favor of another symbolic system.

Will scientists and engineers (computational supercomputer users, whatever)
adopt python because of sympy? Will they reject it because sympy
is not (yet) state of the art? Probably not either. Do these engineers
(etc) care about free software? I think that it is not possible to make
a simple characterization of this.

Anyway, I think the competition for mind share of these people is
Matlab.

I have heard that while Mathematica thinks it competes with Matlab,
Matlab thinks it competes with Excel.

If sympy people learn from their experiences and produce python software
that is useful to other people, and then give it away, it is hard to say
if that is the best use of their time, but it is presumably in everyone's
best interests (certainly my best interest) to have a good design
and implementation of the software.


@skirpichev

Contributor

skirpichev commented Mar 5, 2014

On Wed, Mar 05, 2014 at 07:30:13AM -0800, rfateman wrote:

This is quite backwards. To contribute to sympy you should be an expert in
some content-related subject.

In any case, you should know the language too.

I think that Lisp, especially the Scheme dialect, being almost devoid of
syntax rules

And a lot of semantic rules, just like the python.

Is an imageset something like a
characteristic function? Presumably I could learn the answers by reading
something, but you are claiming it is easy for the non-expert.

Have you tried to look to the documentation of imageset (just type
"imageset?" in the isympy shell)?

No.

I'm sorry, but there is no way to avoid this in general.

Well, this is valid for lisps too. I think we should agree that python
is closer to the syntax of ordinary math.

It is a false equivalence though. At least it is my belief that
python's "+" doesn't mean the addition of ordinary math, but 32 or 64 bit
addition, which is confusingly NOT the syntax of ordinary math.

Actually, it may mean this too. But python also has a builtin type for
arbitrary precision integer arithmetic (the only integer type in python3).

Also 3/2 is changed to 1.
This is hardly the syntax of ordinary math.

The good news is that + and / are arbitrary operators:
http://docs.python.org/2/reference/datamodel.html#object.__add__
So, we can overload these operators for good or evil:
In [1]: n=Integer(3)
In [2]: m=Integer(2)
In [3]: n / m
Out[3]: 3/2
In [4]: type(_)
Out[4]: sympy.core.numbers.Rational
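
For reference, a sketch of what the plain operators do in Python 3 without SymPy, using only the standard library. It shows the behaviour under discussion (true division, floor division, unbounded integers, exact rationals) and is not meant to settle whether the notation is "ordinary math".

```python
from fractions import Fraction

true_division = 3 / 2      # 1.5 in Python 3 (Python 2 gave 1 for ints)
floor_division = 3 // 2    # 1, the explicit spelling of integer division
big = 2**100               # ints are arbitrary precision, no L suffix
third_plus_third = Fraction(1, 3) + Fraction(1, 3)  # exact rational 2/3

print(true_division, floor_division)
print(big)
print(third_plus_third)
```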

By that reason, I guess, there is
a secondary language in the Maxima, right?
There is a surface language in Maxima that contains lots of command function
names and looks more like conventional math with an Algol-60-ish programming
style. It allows a+b. The command foo() calls the underlying lisp
function $foo.
The data is essentially explicitly typed trees, though there are a number of
special forms, and ways of communicating with external programs.

A programmer can use either of the languages or both.

I think it's an additional barrier, which we don't have (thanks to
python). There is no such barrier in Mathematica or Maple either.

The idea that a CAS should have only one language for the user and for the
system developer has been explored for nearly half a century.

Please point out why I'm wrong. See the references to Mathematica
and Maple for an illustration of the trend.

It is my impression that sympy in any case is not python e.g. the
Natural Number example, so I don't know if it is one or two languages.

No doubt, it's the same language. Integer(10) is a python
object, we don't add any new syntax or semantics.

Sometimes non-experts contribute bug
fixes or patches that are bad news. They fix a small glitch but
introduce a long-term more subtle problem.

It is up to the reviewer whether to commit these patches or not:

https://github.com/sympy/sympy/wiki/Development-workflow#wiki-reviewing-patches
Yes, so that means that all the patches are correct??

No. Unless today I woke up in a perfect world...

I am just pointing you to the tools that we use. Can you suggest
something better than peer-reviewing?

@rfateman

rfateman commented Mar 5, 2014

On 3/5/2014 8:34 AM, Sergey B Kirpichev wrote:

On Wed, Mar 05, 2014 at 07:30:13AM -0800, rfateman wrote:

This is quite backwards. To contribute to sympy you should be an
expert in some content-related subject.

In any case, you should know the language too.

I think that Lisp, especially the Scheme dialect, being almost
devoid of syntax rules

And a lot of semantic rules, just like the python.

To the extent that Lisp is a functional language, or is used as a
functional language, and the same for python, the meaning of programs should
be easy to understand. The term "functional language" is a technical term.
I don't know about "semantic rules" though.

Is an imageset something like a characteristic function?
Presumably I could learn the answers by reading
something, but you are claiming it is easy for the non-expert.

Have you tried to look to the documentation of imageset (just
type "imageset?" in the isympy shell)?

No.

I'm sorry, but there is no way to avoid this in general.

Well, this is valid for lisps too. I think we should agree that
python is closer to the syntax of ordinary math.

It is a false equivalence though. At least it is my belief that
python's "+" doesn't mean the addition of ordinary math, but 32 or
64 bit addition, which is confusingly NOT the syntax of ordinary
math.

Actually, it may mean this too. But python has builtin type for
arbitrary precision integer arithmetic too (the only integer type for
python3).

If the rules for python continue to change, incompatibly, that is quite
unfortunate, don't you think? My understanding is that in the recent past,
some programs ceased to work. Free, open, broken.

Also 3/2 is changed to 1. This is hardly the syntax of ordinary
math.

The good news that + or / - arbitrary operators:
http://docs.python.org/2/reference/datamodel.html#object.__add__
So, we can overload these operators for good or evil:
In [1]: n=Integer(3)
In [2]: m=Integer(2)
In [3]: n / m
Out[3]: 3/2
In [4]: type(_)
Out[4]: sympy.core.numbers.Rational

The fact that you have to go outside the standard data model and fix things
suggests that the standard is broken.

(two or one language)

I think, it's an additional barrier, which we don't have (thanks to
python). There is no such a barrier in Mathematica or Maple as well.

False. For the Mathematica system, see
http://reference.wolfram.com/mathematica/tutorial/TheSoftwareEngineeringOfMathematica.html

It has hundreds of thousands of lines of C++ or C code.

Maple has a kernel written in another language (it used to be Margay, I
think).

The idea that a CAS should have only one language for the user and
for the system developer has been explored for nearly half a
century.

Please point me why I'm wrong. See below references to Mathematica
and Maple for illustration of the trend.

I didn't see any references, but here is one from the SIGSAM Bulletin,
vol. 8, issue 2, 1974:
The SCRATCHPAD Language
by R. D. Jenks, IBM
The abstract:

SCRATCHPAD is an interactive system for symbolic mathematical
computation. Its user language, originally intended as a special-purpose
non-procedural language, was designed to capture the style and
succinctness of common mathematical notations, and to serve as a useful,
effective tool for on-line problem solving. This paper describes
extensions to the language which enable it to serve also as a high-level
programming language, both for the formal description of mathematical
algorithms and their efficient implementation.

http://dl.acm.org/citation.cfm?doid=1086830.1086834

note: this is not the first paper on this general topic. Nor the
last. 1974 is only 40 years ago.

Generally there is an ambiguity between x the symbol, x the
container/name for a value, x an object of a certain type, x the name of a
type, and x a mathematical category, and perhaps other aspects.

Many programming languages have variables which must be initialized
before use.
Lisp allows you to deal with symbols, and quote them. This does not
solve all problems in representation, but it is a leg up on other languages in
which one must construct symbol tables, store print-names, properties, etc.

It is my impression that sympy in any case is not python e.g. the
Natural Number example, so I don't know if it is one or two
languages.

No doubt, it's the same language. Integer(10) is a python object, we
don't add any new syntax or semantics.

is NaturalNumber.S a python object?

Sometimes non-experts contribute bug fixes or patches that are
bad news. They fix a small glitch but introduce a long-term
more subtle problem.

It is up to the reviewer whether to commit these patches or not:

https://github.com/sympy/sympy/wiki/Development-workflow#wiki-reviewing-patches

Yes, so that means that all the patches are correct??

No. Unless today I woke up in a perfect world...

I am just pointing you to the tools that we use. Can you suggest something
better than peer-reviewing?

It seems to me that you don't want peer-reviewing
if the person submitting the patch is a newbie. Getting a review from
another (peer) newbie is not so good.

I would expect "expert-reviewing" would result in higher quality,
generally.
It may not be better because you may not have enough experts and so
it might not get done promptly or at all.

It might be possible to use crowd-sourcing in some kinds of review process,
but that doesn't seem plausible for sympy.


@rfateman

rfateman commented Mar 5, 2014

(possibly pounding on a dead horse, but ...)

What does "+" mean?

You might think that C:=A+B can be converted into some kind of
object-oriented detailed specification that would be a call to some
function that looked like this:

Let f be the unique subroutine entry point for "+" based on this
information: {type_of_A, type_of_B, type_of_C};
then call f(A,B).

Unfortunately, there are many ways of adding objects, and they may not
be distinguishable based on the type data. We need more information
before we can be sure we have the right "f".

For example, in Maxima there are some times when you CAN determine the
specific meaning of +, when A and B are both floats, both integers, or
various other well known cases where there are rules for "contagion"
that most people agree on. Though not entirely.
For example, one can convert from floating-point to an (exact) rational
number.
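As a small illustration of the "contagion" point, here is a sketch in plain Python (not Maxima, and not SymPy). It assumes only the standard `fractions` module. The same two operands can be added under two different, equally defensible rules, and the results differ in both type and value.

```python
from fractions import Fraction

a = 0.1                 # a binary floating-point number
b = Fraction(1, 3)      # an exact rational

# Rule 1: Python's built-in contagion, float + Fraction gives a float.
contagious = a + b

# Rule 2: convert the float to the exact rational it represents, then add.
exact = Fraction(a) + b

print(type(contagious).__name__, contagious)
print(type(exact).__name__, exact)
```

Note that `Fraction(0.1)` is the exact value of the stored binary float, which is not 1/10, so the second rule does not produce 13/30 either.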

But if A,B are (say) polynomials, there are several different ways of
adding them using different kinds of simplification, or even conversion
to different data representations.
At the top level these are sometimes controlled by the setting of global
variables.
In writing programs in Lisp, there may be alternate specifically chosen
entry points.
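To make the dispatch problem concrete, here is a minimal sketch in Python. All names in it (`Poly`, `poly`, `add_keep_zeros`, `add_drop_zeros`, `ADD_TABLE`, `add`) are hypothetical and invented for this illustration; they are not taken from Maxima or SymPy. It shows two additions with identical operand types that return different results, so a table keyed on types alone can hold only one of them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Poly:
    """Univariate polynomial stored as a sorted tuple of (exponent, coefficient)."""
    terms: tuple

def poly(d):
    """Build a Poly from a {exponent: coefficient} dict."""
    return Poly(tuple(sorted(d.items())))

def add_keep_zeros(p, q):
    """Add term by term, keeping cancelled terms with coefficient 0."""
    out = dict(p.terms)
    for e, c in q.terms:
        out[e] = out.get(e, 0) + c
    return poly(out)

def add_drop_zeros(p, q):
    """Add term by term, then simplify by removing zero coefficients."""
    s = add_keep_zeros(p, q)
    return Poly(tuple((e, c) for e, c in s.terms if c != 0))

# A dispatch table keyed only on operand types has room for one entry per pair.
ADD_TABLE = {
    (int, int): lambda a, b: a + b,
    (Poly, Poly): add_drop_zeros,
}

def add(a, b, table=ADD_TABLE):
    """Look up "+" from the operand types alone and call it."""
    return table[(type(a), type(b))](a, b)

p = poly({2: 1, 0: 1})    # x^2 + 1
q = poly({2: -1, 1: 3})   # -x^2 + 3x

print(add(p, q))              # the variant the table happens to hold
print(add_keep_zeros(p, q))   # same operand types, different result
```

Choosing between the two variants needs information beyond the types, for example an explicit entry point or a mode setting, which is the point being made above.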

These are not solvable by choosing a nicer language; they appear to be
an inevitable complexity of the task at hand. They are not complicated
in Maxima or other languages because the Maxima system/language is
unnecessarily complicated. It is in the nature of an ambitious symbolic
manipulation program that such choices are (and need to be) available.
If your language is incapable of revealing these choices to the user,
then it is probably not so appropriate for symbolic manipulation. Maybe
OK for a toy demo though.
Which is why so many toy symbolic systems run into trouble when
programmers try to make them serious.

RJF

@rfateman

rfateman commented Mar 6, 2014

Just thinking about sqrt and abs.

Here's an equation: solve x^2=y^2 for x. You will probably get {x=-y,
x=y}, which is correct.
Arguably, sqrt(y^2) covers both, but if your system doesn't think of
sqrt as a set of 2 values, ...

Now consider taking sqrt of both sides.

If that gets |x| = |y|, then you can solve that for x and get

{x=-|y| , x=|y|}, which is a correct solution too.

If you draw the curve for x=|y|, you do not get the same curve as
either x=y or x=-y, and similarly for x=-|y|.

So the transformation

sqrt( x^2=y^2 ) to |x| = |y|

seems to be an incorrect transformation.

You probably knew this, but I thought I'd give a really simple example.
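A quick numeric check of the example, as a sketch in plain Python on an integer grid (the helper names `on_pair_a` and `on_pair_b` are made up for this illustration). It compares the pair {x=y, x=-y} with the pair {x=|y|, x=-|y|}, first as unions and then branch by branch.

```python
def on_pair_a(x, y):
    """Point lies on x = y or on x = -y."""
    return x == y or x == -y

def on_pair_b(x, y):
    """Point lies on x = |y| or on x = -|y|."""
    return x == abs(y) or x == -abs(y)

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]

# Taken together, the two pairs describe the same set of grid points ...
same_union = all(on_pair_a(x, y) == on_pair_b(x, y) for x, y in grid)

# ... but the single branch x = |y| is not the single branch x = y.
branch_differs = [(x, y) for x, y in grid if (x == abs(y)) != (x == y)]

print(same_union)
print(branch_differs[:3])
```

On this grid the unions coincide while the individual branches do not, which is the distinction the curves above are drawing.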

Solving for x in exp(i*pi*x) = exp(i*pi*r) is also related to
powers and branches.
One solution is x=r.
A better one is x=r+2*n for integer n.
Maxima gets the latter from to_poly_solve, the former from the older
solve.
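The periodic family can be checked numerically. This is a sketch using Python's standard `cmath` module, with an arbitrarily chosen value of r and a hypothetical helper `lhs_equals_rhs`; it is a floating-point sanity check, not a symbolic proof.

```python
import cmath

def lhs_equals_rhs(x, r, tol=1e-9):
    """Check exp(i*pi*x) == exp(i*pi*r) up to rounding error."""
    lhs = cmath.exp(1j * cmath.pi * x)
    rhs = cmath.exp(1j * cmath.pi * r)
    return abs(lhs - rhs) < tol

r = 0.3
family = [r + 2 * n for n in range(-3, 4)]   # x = r + 2*n for integer n

print(all(lhs_equals_rhs(x, r) for x in family))
print(lhs_equals_rhs(r + 1, r))              # an odd shift flips the sign
```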

RJF

@hargup

Member

hargup commented Mar 7, 2014

An algorithm discussed on the mailing list uses the properties of the
derivative of the function. Say we are given a continuous and differentiable function f(x) and its derivative w.r.t. x is g(x).
Then by Rolle's Theorem, if g(x) has n real roots then f(x) cannot have more than n + 1 real roots.

For example, we don't have any analytical method to solve sin(x) == x,
but we know x == 0 is a solution as Intersection(sin(x)==0 and x==0) is x==0,
and we can show this is the only solution if we consider the derivative of the
function sin(x) - x, that is cos(x) - 1, which is always non-positive and
vanishes only at isolated points. This implies the function is strictly
decreasing, so it has at most one root.
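The monotonicity argument for sin(x) == x can be sanity-checked numerically. The sketch below uses only the standard `math` module and samples f(x) = sin(x) - x on a grid over [-10, 10]; the grid and the names `f` and `fprime` are choices made for this illustration. Sampling is of course not a proof, only a check that the claimed signs are plausible.

```python
import math

def f(x):
    return math.sin(x) - x

def fprime(x):
    return math.cos(x) - 1.0

xs = [i / 100.0 for i in range(-1000, 1001)]   # grid on [-10, 10], step 0.01

# The derivative cos(x) - 1 is never positive on the grid.
derivative_nonpositive = all(fprime(x) <= 0.0 for x in xs)

# Consecutive samples of f strictly decrease, so f crosses zero at most once.
strictly_decreasing = all(f(a) > f(b) for a, b in zip(xs, xs[1:]))

# x = 0 is a root.
zero_is_root = f(0.0) == 0.0

print(derivative_nonpositive, strictly_decreasing, zero_is_root)
```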

A concern raised by @asmeurer was avoiding infinite recursion, because we will
also need to solve the derivative. This problem can be easily avoided by having
a hierarchy of solvers. Another issue with this is showing that the function is
continuous. I think the problem is doable for a general class of
univariate functions. We had some discussion of this at #2723 (comment) and #2925. I also
found some discussion on the Maxima mailing list: http://comments.gmane.org/gmane.comp.mathematics.maxima.general/38804

@rfateman

rfateman commented Mar 7, 2014

For a symbolic manipulation system, an algorithm that works only for
functions of a single variable is certainly insufficient, and possibly
worse. If you said "we can solve(f,x)" but really all you can solve
is functions f which are f(x), one variable, no parameters, I would
consider it a deceptive claim that you had a symbolic solve program.

The problem of finding one or more zeros of a function of a single variable
is well explored in the numerical world, providing sure-fire, rapid, or
both, convergence.
Look up Brent's method.
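Brent's method itself is too long to show here. As a stand-in, the sketch below uses plain bisection, which is the "sure-fire" ingredient that Brent's method combines with faster interpolation steps. The function name `bisect` and the test equation cos(x) = x are choices made for this illustration.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    flo, fhi = f(lo), f(hi)
    if flo == 0.0:
        return lo
    if fhi == 0.0:
        return hi
    if (flo > 0) == (fhi > 0):
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        fmid = f(mid)
        if fmid == 0.0:
            return mid
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid   # root is in the upper half
        else:
            hi = mid              # root is in the lower half
    return (lo + hi) / 2.0

root = bisect(lambda x: math.cos(x) - x, 0.0, 1.0)
print(root)
```

Bisection halves the bracket on every step, so convergence is guaranteed but only linear; that trade-off is what motivates hybrid methods such as Brent's.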

Now you may say, what if the user asks for a symbolic result? Well, if
the symbolic result is a constant, elaborately written with radicals
and log and such, is it really so useful? Or would a numerical
approximation, offered to a few decimal places, be as useful? Say the
approximation was offered to any precision, on request?
(In an interactive system it would be possible to offer a button "push
this button for 10 more digits in the answer".)

So far as I know, this is not done by any CAS but could be.
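The "10 more digits on request" idea can be sketched with Python's standard `decimal` module. The polynomial x^3 - 2x - 5 and the function name `real_root` are choices made for this illustration, and Newton's iteration is used only because it is short; nothing here is how any particular CAS does it.

```python
from decimal import Decimal, localcontext

def real_root(digits):
    """Approximate the real root of x**3 - 2*x - 5 to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits + 10                      # work with guard digits
        eps = Decimal(10) ** (-(digits + 5))
        x = Decimal(2)                              # starting guess near the root
        while True:
            step = (x**3 - 2*x - 5) / (3*x**2 - 2)  # Newton step f(x)/f'(x)
            x -= step
            if abs(step) < eps:
                break
        ctx.prec = digits
        return +x                                   # round to the requested digits

print(real_root(10))
print(real_root(20))   # the "push this button for 10 more digits" request
```

Each call recomputes from scratch; an interactive system could instead keep the previous approximation as the starting guess.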

In general, I see no way of proving continuity accurately. You can
certainly show that some functions are continuous.

You might also look up Sturm sequences to see how Rolle's theorem (etc) can
be used to find guaranteed bounds on isolating intervals for zeros of
polynomials (of a single variable). This is useful if you don't trust
polynomial zero-finders based on numerical methods.
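For readers who have not met Sturm sequences, here is a minimal sketch in Python using exact `Fraction` arithmetic. Polynomials are lists of coefficients, highest degree first, and all function names are invented for this illustration. It assumes the endpoints `lo` and `hi` are not roots of the polynomial.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest degree first)."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]

def remainder(a, b):
    """Remainder of the polynomial division a / b."""
    a = list(a)
    while len(a) >= len(b) and any(a):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a = a[1:]                      # the leading coefficient is now zero
    return trim(a) if a else [Fraction(0)]

def sturm_chain(p):
    """p, p', then negated remainders until the remainder vanishes."""
    chain = [trim(p), trim(derivative(p))]
    while any(chain[-1]):
        r = remainder(chain[-2], chain[-1])
        if not any(r):
            break
        chain.append([-c for c in r])
    return chain

def evaluate(p, x):
    acc = Fraction(0)
    for c in p:                        # Horner's rule
        acc = acc * x + c
    return acc

def sign_changes(chain, x):
    values = [v for v in (evaluate(p, x) for p in chain) if v != 0]
    return sum(1 for a, b in zip(values, values[1:]) if (a > 0) != (b > 0))

def count_roots(p, lo, hi):
    """Number of distinct real roots of p in (lo, hi], by Sturm's theorem."""
    chain = sturm_chain([Fraction(c) for c in p])
    return sign_changes(chain, Fraction(lo)) - sign_changes(chain, Fraction(hi))

cubic = [1, 0, -3, 1]                  # x^3 - 3x + 1
print(count_roots(cubic, -2, 2))
print(count_roots(cubic, 0, 1))
```

Because every step is exact rational arithmetic, the counts do not depend on floating-point tolerances; repeatedly halving an interval and recounting isolates each root.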

RJF


@skirpichev

Contributor

skirpichev commented Mar 14, 2014

On Wed, Mar 05, 2014 at 09:23:12AM -0800, rfateman wrote:

It seems to me that you don't want peer-reviewing
if the person submitting the patch is a newbie.

Why do you think so?

I can quote this reference here:

Reviewers thus are an integral part of the development process. Note
that you do not have to have any special pull or other privileges to
review patches: anyone with Python on his/her computer can review.

So, we encourage you go to pull requests (or issues) and add
comments, if you have any objections.

That's the important part of my reply. Please ignore the rest,
I don't think that this language discussion makes any sense in this PR.
We are using Python now, that's all.

And a lot of semantic rules, just like the python.
To the extent that Lisp is a functional language, or is used as a
functional language,
and the same for python, the meaning of programs should be easy to
understand. The term "functional language" is a technical term. I don't
know about "semantic rules" though.

I'm talking about the "special forms", like "if" in Scheme.
(if a b c) and (+ a b c) share a common syntax, but their evaluation
rules are very different.
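The difference in evaluation rules can be shown in Python as well, where the conditional expression plays the role of the special form. This is a sketch; the helpers `noisy` and `my_if` are hypothetical names made up for the illustration.

```python
calls = []

def noisy(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value

def my_if(cond, then_value, else_value):
    """An ordinary function: all three arguments are evaluated before the call."""
    return then_value if cond else else_value

# Ordinary function call: both branches are evaluated.
calls.clear()
my_if(noisy("cond", True), noisy("then", 1), noisy("else", 2))
function_call_order = list(calls)

# Conditional expression (a special form): only the taken branch is evaluated.
calls.clear()
noisy("then", 1) if noisy("cond", True) else noisy("else", 2)
special_form_order = list(calls)

print(function_call_order)
print(special_form_order)
```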

Don't get me wrong, I share the same respectful relationship with lisps as
Matthew (and I like Scheme). The fact is - python is a popular
language, CLisp - is not.

If the rules for python continue to change, incompatibly, that is quite
unfortunate, don't you think?

Languages evolve with time. There is not necessarily anything
bad in that, it's life. BTW, Python is backward-compatible: python3
and python2 are different languages.

The fact that you have to go outside the standard data model

I don't think so.

(two or one language)

I think it's an additional barrier, which we don't have (thanks to
python). There is no such barrier in Mathematica or Maple either.

False. The Mathematica system (see
http://reference.wolfram.com/mathematica/tutorial/TheSoftwareEngineeringOfMathematica.html)
has hundreds of thousands of lines of C++ or C code.

Sorry, but this source asserts that Mathematica has several
million LOC. Naturally, some parts are written in C, e.g. the language itself.
We don't take into account the python language codebase when we are
talking about SymPy. Same here.

is NaturalNumber.S a python object?

I don't understand what this is.

@rfateman

rfateman commented Mar 14, 2014

On 3/14/2014 4:17 AM, Sergey B Kirpichev wrote:

On Wed, Mar 05, 2014 at 09:23:12AM -0800, rfateman wrote:

It seems to me that you don't want peer-reviewing
if the person submitting the patch is a newbie.

Why do you think so?

A peer of a newbie is a newbie. I think one would benefit from an
expert reviewing a patch.

In an extreme case, you could have two or a few newbies submitting
patches and approving each others' patches when they were either
wrong, or deliberately wrong, and hijacking the whole project.

I can quote this reference here:

Reviewers thus are an integral part of the development process. Note
that you do not have to have any special pull or other privileges to
review patches: anyone with Python on his/her computer can review.

Very egalitarian. How would you feel if the Food and Drug Administration
approved new drugs that way? Anyone can submit a drug, anyone can
review it?

So, we encourage you go to pull requests (or issues) and add
comments, if you have any objections.

That's the important part of my reply. Please ignore the rest,
I don't think that this language discussion makes any sense in
this PR.
We are using Python now, that's all.

And a lot of semantic rules, just like the python.
To the extent that Lisp is a functional language, or is used as a
functional language,
and the same for python, the meaning of programs should be easy to
understand. The term "functional language" is a technical term. I don't
know about "semantic rules" though.

I'm talking about the "special forms", like "if" in Scheme.
(if a b c) and (+ a b c) share a common syntax, but their evaluation
rules are very different.

There are a handful of special forms, and frankly the new user of lisp is
unlikely to be confused by (defun ...) (if ...) (and ...) (or ...).
Perhaps the trickiest is (lambda ...). Others exist but are rarely used
and irrelevant for newbies.
For a language like python, nearly everything other than function call
is a special form. For some languages the list of binding powers and
precedences is substantial and permeates the understanding of infix
expressions.
Offhand, I think python has about 10 levels, C++ has many more, and
Mathematica has something like 70 levels (most unused).

Don't get me wrong, I share the same respectful relationship with lisps as
Matthew (and I like Scheme). The fact is - python is a popular
language, CLisp - is not.

Arguing that something is good because it is popular? Look up
Argumentum ad populum in wikipedia.

If the rules for python continue to change, incompatibly, that is quite
unfortunate, don't you think?

Languages evolve with time. There is not necessarily anything
bad in that, it's life. BTW, Python is backward-compatible: python3
and python2 are different languages.

Unclear what you are saying. That it is a good thing that you write in
python 2 and it is a good thing that python3 is not compatible with it?

The fact that you have to go outside the standard data model

I don't think so.

(two or one language)

I think it's an additional barrier, which we don't have (thanks to
python). There is no such barrier in Mathematica or Maple either.

False. The Mathematica system (see
http://reference.wolfram.com/mathematica/tutorial/TheSoftwareEngineeringOfMathematica.html)
has hundreds of thousands of lines of C++ or C code.

Sorry, but this source asserts that Mathematica has several
million LOC. Naturally, some parts are written in C, e.g. the language itself.

The Wolfram people are pretty slippery. They would like you to believe
that it is mostly written in their own language, but there are MILLIONS
of lines of code in their C-like implementation language.
Earlier (printed) reference manuals proudly tout this, before Wolfram
began pushing his implementation language as the solution for everything.

Huge amounts, not a small core, are written in (essentially) C. My
understanding is that the modification to C is an attempt to provide a
language which imposes reference-counting uniformly on the program.

We don't take into account the python language codebase, when we are
talking about the Sympy. Same here.

is NaturalNumber.S a python object?

I don't understand what this is.

I think you will have to look back at this thread. Is n*(n+1)/2 a
natural number if n is a natural number?

Whoever said sympy knows that seemed to use integer, assume n>0 and
NaturalNumber.s
and I don't know why.
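For the arithmetic fact itself, a short sketch in plain Python integers (the function name `triangular` is made up here): one of n and n+1 is always even, so the division by 2 is exact. This says nothing about how SymPy's assumptions system answers the question.

```python
def triangular(n):
    """Return n*(n+1)/2 as an exact integer, for a natural number n."""
    product = n * (n + 1)
    assert product % 2 == 0   # one of two consecutive integers is even
    return product // 2

values = [triangular(n) for n in range(1, 8)]
print(values)
```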

